The long road back


For now, Japan’s scientists have higher priorities than rebuilding their research infrastructure, but 
when they do get to it, they will need help from the international scientific community. 


The Japanese prime minister, Naoto Kan, has called this month's
earthquake, tsunami and devastation at a nuclear-reactor plant
the worst disaster to hit the country since 1945. The death toll
has soared past 20,000, and the full extent of the damage won't be
known for some time (see News page 420).

The situation has also been a catastrophe for science in Japan. Laboratories
have been destroyed, and Tohoku University in Sendai, one of
Japan's premier research institutions, will be closed until at least the 
end of next month. Many buildings can’t be entered, and broken equip- 
ment and destroyed samples fill those that can. The impact stretches 
down the east coast to Tsukuba Science City, where 40% of the coun- 
try’s researchers are based. Even in the greater Tokyo area, where most 
facilities escaped physical damage, much research has come to a halt 
because of blackouts and an exodus of expatriate researchers, who have 
left because of worries about radiation. 

Alongside the humanitarian aid that has poured in from dozens of 
countries, scientists around the world are offering to help their Japa- 
nese colleagues. Some have approached acquaintances, whereas others 
are taking more formal approaches. 

The US National Institutes of Health is planning to provide 
temporary scientific homes for those who have lost research facili- 
ties in Japan. The Nippon Science Support Network, supported by 
Nature Network, is helping to coordinate scientific relief efforts from 
Germany; as of Tuesday, the site had 18 offers of scientific positions 
and other services such as computer-server space, many of them 
fully funded, ranging from mathematics to molecular pharmacology 
and plasma and astrophysics. An international grass-roots initia- 
tive is collecting small-scale support in the form of accommodation, 
funding, lab space and server space. The German national academy 
of sciences, the Leopoldina in Halle, the German Academy of Science 
and Engineering in Berlin, and the Berlin-Brandenburg Academy of 
Sciences and Humanities have offered €5 million (US$7.1 million) 
to support Japanese science. A group at the Chinese National Center 
for Nanoscience and Technology in Beijing has offered to host scien- 
tists. And institutions within Japan are discussing loans or donations 
of instruments. 

Those making these generous offers shouldn't be surprised if they 
are not taken up just yet. Many affected scientists don't have consistent 
access to the Internet, and most are more concerned with the necessities 
of daily life. Before going abroad, scientists are trying to work out what 
they can salvage of their laboratories at home. And although the best 
thing for their research might be to move to a facility where basics such 
as water and electricity are not a problem, many — especially senior 
researchers — have work or family obligations in Japan. 

This will change as researchers size up what they need to do to reconstruct
their research. The best thing the scientific community can do is
to keep these offers open. Young Japanese researchers, especially, should
be ready to take advantage, and this could be a timely boost for the
nation's science: in the past ten years, the number of young researchers
travelling abroad has fallen dramatically, producing an insular research
community that could benefit from more outside contact.

The destruction also brings research opportunities. The devastation
shows the power of tsunamis and earthquakes and the seriousness of
energy shortages, and associated medical problems could bring home
the importance of science to a generation of uninterested school
children, whose curriculum currently contains less science than at
any time in recent history. Ryoichi Matsuda, a biologist at the
University of Tokyo, suggests that the tragedy could be used to
re-emphasize "science education for survival".

Rebuilding also has its benefits. Scrapping of old nuclear reactors
will open discussions of other options, such as the introduction of
geothermal energy. Tohoku University administrators are talking about
improving the institution's infrastructure, which could see wider
refurbishment of outdated facilities.

In a time of such death and devastation, the scientific infrastructure
will, of course, not be the country’s first priority. Scientists throughout 
Japan are preparing for cuts to help the northeast get back on its feet. 
But the nation cannot survive without science and technology. 

The Japanese government will, no doubt, step up to the challenge 
of rebuilding its science, but there will be a long struggle to build a 
solid foundation, and many research lives could fall through the cracks. 
Those who are creating windows of opportunity for Japan’s needy sci- 
entists should keep them open. And others might want to think about 
opening more.


Contact your MP! 


British readers should help to change libel laws 
that suppress global scientific discussion. 


Britain's Houses of Parliament were earlier this month the scene
of a gathering by an array of campaigning and media organizations,
who came together to press for reform of the libel laws of
England and Wales. Amnesty International UK, Global Witness, Facebook,
Mumsnet, the British Medical Journal and Nature were among
those represented, urging a group of Members of Parliament (MPs)
to support the introduction of new legislation.

The few dozen MPs at the meeting needed little persuading, but
there are more than 600 others, and more proposed legislative issues

24 MARCH 2011 | VOL 471 | NATURE | 409 


© 2011 Macmillan Publishers Limited. All rights reserved 


| THIS WEEK | EDITORIALS 


than can be handled within the current Parliament, before the next 
general election. So the fight for the attention of MPs will be crucial 
in the coming months. It is a fair bet that few of them, if any, have ever
heard from constituents on this issue. But they do pay attention to 
their mailboxes, and there is now both an opportunity and a need for 
anyone concerned about these issues to help to ensure that essential 
legislative reform is seen through. 

The good news is that all three of Britain’s main political parties 
support such reform. Over the years, it has become increasingly clear 
that the burden of libel litigation falls too heavily on those who write 
about misconduct and bad practice. And with the global nature of 
the Internet, anyone outside the United Kingdom can sue anyone else 
outside the country using this law, provided that the libel was acces- 
sible to readers in the United Kingdom. Journalism on scientific issues 
has been acutely affected, and the need for reform and examples of 
problems within science were highlighted in these pages last year (see 
Nature 464, 1104; 2010). 

At Nature, we have too often been hindered in our core mission 
because of legal risks. On one occasion we were unable to link to a uni- 
versity’s website to point our readers to the outcome of a misconduct 
investigation, associated with the retraction of a research paper, 
because of a threat from the person found guilty by the university. 
There has been journalism about misconduct — central not only to the 
interests of Nature’s readers but also to public trust in science — that 
we have decided not to commission, because we decided that the risks 
of costly libel action outweighed the undoubted significance of the 
stories. We will always pursue the most significant cases of transgres- 
sion — on one occasion at very considerable legal expense. But there 
is a layer of less egregious yet still significant misconduct that we are 
not covering because of the risks of such costs. 

Britain's coalition government has now introduced draft legislative
reform that would allow us to perform our core mission with fewer
restrictions. Part of the problem with the existing libel laws is that they
place a heavy burden of proof on the defendants and have little scope
for a public-interest defence; both areas will be retuned under the
proposed reform.

Ministers responsible for the proposed changes specifically mention
the freedom of scientific debate as one focus of their concern.
The changes and a consultation paper can be found at
go.nature.com/o3vw5r.

We at Nature will respond to that consultation. And we urge readers
to respond to the detailed questions if they are seriously interested
in strengthening the ability of scientists, journalists and others to
report and comment publicly on misconduct or to speak responsibly and
freely about problematic products or actions of large institutions and
companies. The Libel Reform Campaign, a coalition of interested
organizations, has published an initial response to the draft bill
making clear their view of what further changes should be
sought (see go.nature.com/vdjvna).

The consultation process has a deadline of 10 June 2011. The bill 
will then be amended and formally introduced to Parliament for 
implementation in 2012. That will be another moment at which sup- 
port for the bill will be crucial. But the attention of MPs needs to be 
sharpened now. To that end, readers in the United Kingdom should 
immediately contact their MPs to draw attention to the issue and to 
urge their support. The organization Sense About Science has published
a template letter and MPs' contact details: see go.nature.com/xrdcfx,
and ask MPs to sign Early Day Motion 1636, tabled by
Cambridge MP and scientist Julian Huppert and supported by cross-party
colleagues.


A unifying cause 
Conference of science journalists can strengthen 
ties between the Arab world and the West. 


With the recent awakening in the Arab world of movements
for democracy and free speech, it is timely that the World
Conference of Science Journalists (WCSJ), on 27-29 June,
will for the first time be held in an Arab country. Even organizing the 
conference in Qatar has, in its own small way, promoted collabora- 
tion between the Western and Arab journalists involved. It can only 
be hoped that the mingling of science reporters at the event will have 
a similar, and lasting, effect. Western journalists attending the con- 
ference should take the opportunity to see the Middle East, meet its 
scientists and learn more about how science might contribute to sus- 
tainable development of the region, and the substantial challenges it 
faces, in particular at this crucial and historic moment in the region's 
history. Support for science in the Arab world has long been at levels 
far below those in other countries, although there have been some 
recent improvements (see Nature 470, 147-149; 2011). A twinning 
between the young Arab Science Journalists Association and the well- 
established US National Association of Science Writers in 2007 made 
the joint bid to bring the conference to the Arab world possible. That 
twinning also built powerful ties between science journalists in the 
Arab world and in the United States. Arab journalists were invited to 
American science and science journalism conferences, and American 
journalists attended the first Arab science journalists conference in 
2008. There was much to learn for both sides as they shared challenges, 
advice and opportunities. It created mutual understanding between 
two regions that are often perceived as being at odds with one another.

It is a great pity, although understandable given the recent unrest
and uncertainty in Egypt, that the organizers decided to relocate the 
conference from its original planned venue in Cairo to Doha in Qatar. 
It would have been symbolic to hold a major conference of journalists 
in a nation that has just overthrown the shackles of a dictatorship that
repressed free speech and the critical thought and questioning that sci- 
ence and science journalism thrive on. But at least the venue has been 
kept in the Arab world, and has not been moved to the United States, 
which was discussed as an alternative venue at one point. 

Holding the conference in Qatar will hopefully also provide a boost 
to science journalism in the region, which has suffered, as has all jour- 
nalism and civil society, under authoritarian regimes. There are no 
dedicated science journalism courses in any of the universities in Arab 
states and, although there have been improvements, much of the sci- 
ence journalism there remains poor. The conference is a chance for 
Arab science journalists to rub shoulders with colleagues from all over 
the world and exchange their experiences. The connections made will 
be invaluable as science becomes more global. Many local and regional 
organizations are now thinking about projects they can put together to 
train and support science journalists. This will create a momentum to 
support the profession long after the conference has come and gone. 

Past conferences have catered too much to Western issues, but this 
year’s WCSJ, with a rich programme and speakers from more than 
40 countries, promises to begin providing greater balance. Speak- 
ers from the Arab World, Africa, Latin America and Asia will give 
delegates greater insights into the science needs and challenges of 
the developing world. There is much reconstruction of civil soci- 
ety to do in the fledgling democracies of Tunisia and Egypt, and 
science journalism can play its own small part 
in prompting debate on crucial science-based 
issues in every sector, as well as bringing greater 
scrutiny to the glaring needs in research and 
higher education.


Do not phase out nuclear power — yet

Fission power must remain a crucial part of the energy mix until renewable
energy technologies can be scaled up, argues Charles D. Ferguson.

The ongoing Japanese nuclear crisis underscores yet again the
risks inherent in this essential energy source. But it should not
divert nations from using or pursuing nuclear power to generate
electricity, given the threat from climate change, the health hazards
of fossil fuels, and the undeveloped state of renewable energy. Instead,
the events at the Fukushima Daiichi Nuclear Power Plant should turn
more attention to ensuring that nuclear power plants meet the highest
standards of safety and protection against natural disasters.

More than 30 nations have commercial nuclear power plants.
A further two dozen are interested in having them, including several
in earthquake risk areas such as Indonesia, Malaysia and Turkey.

Some nations are pro-nuclear for energy security; some for prestige.
Others, including Iran, have invested in nuclear power because they
may want the capability to make nuclear weapons. These nations are
seeking to acquire uranium enrichment or reprocessing technologies:
useful either for producing fuel for peaceful nuclear reactors or
fissile material for nuclear bombs.

Although some national leaders profess to be interested in nuclear
energy because operating plants do not emit greenhouse gases, this is
usually a secondary motivation. If it were their primary concern,
nations would invest far more than they have in measures such as
energy efficiency and solar and wind technologies.

The Japanese crisis has affected three important criteria: public
opinion, safety and economic costs. Governments and utilities have had
to grapple with these for decades. Now they must renew their efforts
to finance expensive nuclear projects and ensure that existing and
future nuclear plants maintain the highest standards, and must be seen
to do so by the public.

Building nuclear power plants has always been expensive. For a large
reactor with a power rating of 1,000 megawatts or greater, the capital
cost ranges from US$4 billion to $9 billion depending on reactor design,
financing charges, the regulatory process and construction time. The
recent nuclear crisis is likely to change all of these, pushing up costs.

Contemporary plant designs, known as 'generation III', have better
safety features than the 1970s-era generation II designs for the
Fukushima reactors, making them more expensive. Some, such as
the AP1000 designed by the Westinghouse Electric Company, headquartered
in Cranberry Township, Pennsylvania, have passive safety
features that do not require technicians to activate emergency systems
or electrical power to ensure safety after a mishap. Others, such as
Paris-based Areva's EPR, have advanced active safety systems designed
to prevent the release of radioactive material to the environment.
Further designs, such as the pebble-bed modular reactor, may prevent
nuclear fuel from ever experiencing
a meltdown. Concerns were raised about the Fukushima designs
as early as 1972, the year after reactor unit 1 began operations. But 
the nuclear industry opposed shutting down such reactors because 
32 were in operation worldwide — about 7% of the world’s total. 
Almost one-quarter of the reactors in the United States are of this 
type. The remaining plants of this design should undergo a thorough 
safety review and, as a result, some may need to close. Since the 
crisis began, several governments, including China, Germany and 
Switzerland, have called for increased scrutiny of their plants and 
a moratorium on plant construction until plant safety is assured. 
Germany has also shut down its seven oldest reactors. 

But phasing out nuclear power worldwide would be an overreaction. 
It provides about 15% of global electricity and even larger percentages 
in certain countries, such as France (almost 80%) 
and the United States (about 20%). Eliminating 
nuclear power would lead to much greater use of 
fossil fuels, and raise greenhouse-gas emissions. It 
will probably take at least a few decades to massively
scale up use of renewable sources. Meanwhile,
nuclear plants can bridge the energy gap.

So governments need to take practical actions 
to improve nuclear safety. All new nuclear plants 
should have enhanced safety systems, and plant 
designs that eliminate or substantially reduce 
the risk of a meltdown of fuel should be devel- 
oped. Existing plants deemed to fail improved 
safety standards should be retrofitted or, when 
necessary, phased out. Further, governments 
must force their nuclear providers to remove 
spent fuel — typically after five years of cooling 
— from storage pools and place it in dry cask storage. As the world 
witnessed, spent fuel in the overcrowded above-ground cooling pools 
at Fukushima Daiichi became exposed to the air. If spent fuel catches 
fire, radioactive materials can be widely dispersed. 

Because of decreased public confidence following the Japanese 
accident, governments and industry must have an honest conversation 
about the role of nuclear energy in meeting consumers’ electricity 
demands, the typically high safety record of almost all plants and the 
risks of this technology. These discussions must implement one of the 
primary lessons of the Japanese accident: that officials should dra- 
matically increase transparency of nuclear operations. Simultaneously, 
nations need to invest far more in renewable energy sources, which 
offer the path to a truly sustainable global energy system.


Charles D. Ferguson trained as a nuclear engineer and a physicist, 
and is president of the Federation of American Scientists in 
Washington DC and author of the forthcoming book Nuclear Energy: 
What Everyone Needs to Know (Oxford University Press, May 2011). 
e-mail: cferguson@fas.org


RESEARCH HIGHLIGHTS
Selections from the scientific literature


Liver cancer lifeline found

A survey of liver tumours has highlighted a gene that many such
tumours depend on for survival.

Scott Lowe and Scott Powers at Cold Spring Harbor Laboratory in
New York and their colleagues searched the genomes of 89 liver tumours
and 12 liver-cancer cell lines. They identified 124 genes that were
sometimes expressed in excess; overexpression of 18 of these caused
liver cells transplanted into mice to become cancerous.

In particular, cells overexpressing the gene FGF19 became dependent on
this expression. An antibody that blocks the FGF19 protein inhibited
the growth of these cells, suggesting that patients in whom this gene
is overexpressed could benefit from therapies that block the protein.

Cancer Cell doi:10.1016/j.ccr.2011.01.040 (2011)


CELL BIOLOGY

Plot twist for proteins

Cells interpret external chemical signals through membrane-spanning
receptors that bind the chemicals and change shape to alter the cells'
activities. Raymond Stevens at the Scripps Research Institute in
La Jolla, California, and his colleagues now report the structure of
one of the G-protein coupled receptors (GPCRs) bound to an activating
drug.

The authors used X-ray crystallography to reveal the shape of the
A2A adenosine GPCR (pictured) bound to an agonist called UK-432097.
This is the first time an agonist has been shown to bind to and
stabilize the receptor without the aid of a G protein.

Better understanding of GPCR structure and function could help
scientists to develop treatments for conditions such as chronic
obstructive pulmonary disease.

Science doi:10.1126/science.1202793 (2011)


ECOLOGY

Better fragmented than lost

Separating the effects of habitat loss and habitat fragmentation is
difficult. To solve this problem, Mary Bonin and her colleagues at
James Cook University in Townsville, Australia, arranged a series of
experimental reefs off Papua New Guinea.

Few damselfish survived on reefs of Acropora subglabra (pictured) from
which 75% of coral had been removed, whereas those on reefs that had
been broken up but maintained in area actually did better than those
on untouched control reefs. Species richness and abundance were also
higher on fragmented reefs than on those that had lost habitat.
Although the positive effect of fragmentation declined over a 16-week
period, the impact of habitat loss worsened in this time, suggesting
that reported declines in fish populations after habitat disruption
are down to the latter and not the former.

Ecology doi:10.1890/10-0627.1 (2011)


PHYSICS

An 'electric' force for neutral atoms

Neutral atoms can be made to behave like charged particles by
'synthetic' electric and magnetic fields. These are created through
the production of a synthetic gauge field in a state of matter known
as a Bose-Einstein condensate (BEC), in which the atoms are all
identical and behave collectively as if they were one 'superatom'.

Ian Spielman and his team at the National Institute of Standards and
Technology in Gaithersburg, Maryland, have previously generated
synthetic magnetic fields by spatial alteration of a time-independent
electromagnetic vector potential, an entity that can be used to
specify both the electric and the magnetic fields. Now they show that
a synthetic electric field can be generated in a rubidium BEC in a
parallel manner, by changing a spatially uniform vector potential over
time. The neutral atoms in the condensate were accelerated like
charged particles in the synthetic electric field that was induced by
changing the electromagnetic vector potential.

Nature Phys. doi:10.1038/nphys1954 (2011)


Methane rain falls on Titan

Images from NASA's Cassini probe revealed vast lakes of liquid
hydrocarbons around the poles of Titan, Saturn's largest moon, in
2006. Elizabeth Turtle of Johns Hopkins University's Applied Physics
Laboratory in Laurel, Maryland, and her colleagues now add the
discovery that methane probably rains on Titan at low equatorial
latitudes.

Images of the arid equatorial region taken by Cassini last October
show dunes that appear darker after clouds had passed overhead. This
suggests that the clouds rained liquid methane, which wet the surface,
says the team. Such rain is thought to be seasonal and may play a part
in dune formation by cementing fine atmospheric aerosol particles.

Science 331, 1414-1417 (2011)


CHEMISTRY

Copper makes for selectivity

Carbon-based compounds attached to a lithium atom are widely used in
chemical reactions to add new carbon-carbon bonds to molecules.
However, these organolithium reagents are not normally selective
enough to create only one of two possible mirror-image forms, or
enantiomers, of the same molecule.

Ben Feringa, Syuzanna Harutyunyan and their colleagues at the
University of Groningen in the Netherlands now show off a way to use
these organolithium reagents in a highly selective manner. Using these
reagents, the team demonstrated selectivity of 90% or higher for one
enantiomer in more than 20 different reactions.

Nature Chem. doi:10.1038/nchem.1009 (2011)


How the chicken's neck got naked

How chickens of a particular breed ensure that their necks remain
feather free (an adaptation to hot weather) has been explained.

Denis Headon at the University of Edinburgh, UK, and his co-workers
found that naked neck chickens (pictured) have a mutation that boosts
levels of the protein BMP12. Adding this protein to neck skin tissue
cultures from normal chickens blocked feather formation, although the
treatment had little effect on skin cultures from other parts of the
body. The skin on chickens' necks naturally contains high levels of a
signalling molecule called retinoic acid. This, the researchers show,
primes neck skin cells to respond to the extra BMP12 made by naked
neck chickens, preventing feather growth.

PLoS Biol. 9, e1001028 (2011)


COMMUNITY CHOICE
Highly read on www.cell.com

Skeleton boosts stud quotient

That gonadal hormones influence bone remodelling has been well
documented, but it is not the end of the story. Gerard Karsenty at
Columbia University in New York and his colleagues have found that
this influence runs in both directions, although only in males.

They show that the hormone osteocalcin, made by bone cells called
osteoblasts, induces testosterone production by testicular Leydig
cells, the body's key testosterone factories. The researchers
demonstrate this in both mouse-cell cultures and live mice, and also
identify an osteocalcin receptor expressed in Leydig cells but not in
follicular cells of the ovary. Male mice engineered to lack this
receptor are subfertile, as are male mice engineered to lack
osteocalcin.

Cell 144, 796-809 (2011)


Melting of the third pole

Aerosols such as black carbon and dust particles seem to have a
greater effect on the Tibetan Plateau's snow than does anthropogenic
climate change.

Yun Qian of the Pacific Northwest National Laboratory in Richland,
Washington, and his colleagues simulated the impact of carbon dioxide
and aerosols on the snowpack using a global climate model. They found
that the deposition of black carbon on snow increases the surface air
temperature by an average of 1 °C across the plateau because it boosts
the absorption of sunlight. From April to July, black carbon is up to
four times more effective at melting snowpack per degree of warming it
induces than is air warming from increased atmospheric carbon dioxide.

Aerosols have a larger effect on the Tibetan Plateau (pictured) than
on any other snow-covered region in the world, producing earlier
snowmelt and affecting monsoon trends, the authors report.

Atmos. Chem. Phys. 11, 1929-1948 (2011)


MICROBIOLOGY 


Sugary coat 
of armour 


The soil-dwelling bacterium
Bacillus cereus is a close relative
of the microbe responsible
for anthrax, and can cause a
similar illness if it is inhaled by
people with damaged lungs.

Olaf Schneewind and his 
co-workers at the University 
of Chicago in Illinois find 
that a pathogenic strain of 
B. cereus harbours two sets of 
genes that encode protective 
sugar coats. These coats 
prevent the bacteria from 
being engulfed and destroyed 
by certain immune cells. One 
gene set produces a protective 
capsule made of the sugar 
hyaluronic acid, whereas the 
other capsule is made of an 
unidentified sugar. 

Knocking out one set of 
the capsule-making genes 
reduced the microbe’s 
virulence in mice; if both were 
knocked out, the bacteria no 
longer caused disease. 

Mol. Microbiol. doi:10.1111/j.1365-2958.2011.07582.x (2011)


UK libel reform

Long-awaited reforms to English and Welsh libel laws were proposed in
draft legislation published by the UK government on 15 March. Changes
to the existing laws could establish greater legal protection for
scientists and journalists wanting to debate scientific or medical
issues. The proposals suggest explicit protection for those reporting
on academic conferences. A consultation runs until 10 June. See
page 409 for more.


US budget limbo 


Researchers in the United States face another anxious wait before
finding out the extent of cuts to science agencies' funding in the
federal budget for the 2011 fiscal year. On 18 March, President
Barack Obama signed a short-term continuing resolution to fund the
government until 8 April, avoiding a shutdown. Until then, the
Republican-majority House and the Democrat-majority Senate will
continue to thrash out their differences on 2011 budget cuts (for
details, see Nature 471, 144-145; 2011).


SOUNDBITE 


“We’re on the verge of an Arctic ozone hole.”

Markus Rex of the Alfred Wegener Institute for Polar and Marine
Research in Potsdam, Germany. See go.nature.com/pkzgsl for more on
the Arctic's unprecedented ozone depletion this spring.


Fukushima sparks anti-nuclear protests


Japan's nuclear crisis has shaken public
confidence in the safety of atomic energy, but
most governments are reluctant to pull the plug
on nuclear plans. Instead they are promising to
rethink strategies and undertake safety checks.
European Union leaders decided that existing
reactors in the region should undergo stress
tests, which are voluntary. Facing public protests
(pictured), Germany temporarily switched off
its seven oldest reactors and put a three-month
moratorium on plans to extend reactor lifespans.
Although Britain, Russia, France, the
United States and India all announced safety
checks of existing plants last week, none is
delaying expansion plans. China, however,
announced plans to temporarily suspend work
on 26 reactors under construction, and to
reconsider long-term expansion plans involving
at least 50 more reactors. See go.nature.com/smé6blc
and page 411 for more.


Elite spared cuts
Research-intensive
universities in England will be
cushioned from cash cuts to
the nation’s higher-education
grants. Compared with
2010-11, institutions next
year will lose on average 3.7%
of their research and teaching
grants, which respectively total
£1.6 billion (US$2.6 billion)
and £4.3 billion. But
provisional allocations
released on 17 March show
that small, teaching-focused
institutions will lose up to 10%
in grants, whereas research-intensive
universities such as
Cambridge are to see budget
cuts of less than 1% — and
some, such as Oxford, will
see increases. See
go.nature.com/2xgqfj for more.


BUSINESS
Celera sold


Pioneering genetic sequencing
company Celera has been
sold to the medical giant
Quest Diagnostics for
US$671 million, it announced
on 18 March. In 1998,
Celera boasted that it would
sequence the human genome
within three years — and
then succeeded. Since then
the company, founded by
geneticist Craig Venter and
based in Alameda, California,
has focused on developing
genetic tests for diseases.
The deal will give Quest
Diagnostics, of Madison,
New Jersey, access to Celera’s
pipeline of disease biomarkers.


Solar subsidy cuts 
The United Kingdom has
signalled its intent to join
Spain, Germany, France
and Italy in reducing state
incentives for solar-power
production. In a consultation
document published on
18 March, the UK Department
of Energy and Climate Change
suggested cutting feed-in tariffs
— the price that an electricity
utility must pay to generators
of solar energy — because
too many large-scale projects
were being planned. Under
the revised system, subsidies
for large arrays of photovoltaic
panels would be slashed by
50-75% from 1 August.


REUTERS/R. ORLOWSKI
T. GLASS/WWW.TRISTANDC.COM
SOURCE: EEX


RESEARCH
Arthropod project


A consortium of US and
European scientists hopes
to sequence the complete
genomes of 5,000 species of
arthropod within 5 years. The
i5K Arthropod Sequencing
Initiative will focus on
economically important
insects, disease vectors, model
organisms and representatives
from all the evolutionary
branches of insects and related
species. Currently, fewer than
50 arthropod genomes have
been sequenced. The initiative
was formally announced last
week (G. E. Robinson et al.
Science 331, 1386; 2011), and
species can be nominated
for sequencing at
www.arthropodgenomes.org/wiki/i5K.


Penguins in peril 
Conservationists are warning
of an ecological disaster in
the south Atlantic after the
cargo vessel MS Oliva broke
up near Nightingale Island
on 18 March, spilling heavy
crude oil and threatening
penguins already classed
as endangered. The island
forms part of the Tristan da
Cunha group, which is UK
territory; tens of thousands
of Northern Rockhopper
penguins (Eudyptes moseleyi)
live there. A slick from leaking
oil extends 13 kilometres
offshore, and oiled penguins
(pictured) have been seen on
the shores. See
go.nature.com/z11rlp for more.


TREND WATCH


Jumps in the prices of electricity
and European carbon permits
accompanied the German
government's 15 March decision
to shut the country’s seven oldest
nuclear reactors, and to suspend
the planned extension of licences
to operate other nuclear plants.
To compensate for the drop in
nuclear energy supply, utilities
companies expect to switch to
producing electricity from more
carbon-intensive gas and coal,
thus boosting demand for future
allowances to emit carbon dioxide
on the Emissions Trading System.


Alone in space 


The next large science mission 
for the European Space 
Agency will have to make do 
without funding from NASA, 
the agency has decided. The 
‘L-class’ mission, to launch
around 2020, will be selected 
in February 2012 from three 
competitors, and will see its 
budget capped at €700 million 
(US$996 million), about half 
of what had been hoped for. 
See page 421 for more. 


Orbiting Mercury 
NASA's MESSENGER 
spacecraft has become the first 
probe ever to orbit Mercury. 
The craft slipped into orbit 
around the Solar System's 
innermost planet on 18 March,
after a 6.5-year flight that
included three earlier flybys
of its target. MESSENGER
will conduct a comprehensive
one-year survey of Mercury,
studying surface features as
small as 18 metres across,
and searching for hints of ice
within permanently shadowed 
craters near the planet’s poles. 
It will also make magnetic- 
field measurements that could 
reveal structural details about 
Mercury’s iron core. NASA 
expects the probe to start 
beaming imaging data back to 
Earth from 4 April. 


Crop genetics 


An international treaty, whose 
127 signatories pledge to share 
genetic information about food 
crops, has secured more than 
US$10 million in donations 
for a second round of research 
grants, aimed at conserving 
global food security. Grant 
winners will be announced in 
May. Best known for its role
in enabling the construction
of the Svalbard Global Seed 
Vault in Norway, the treaty had 
struggled to gather research 
funds. The latest donations 
were announced at a meeting 
last week in Bali, Indonesia. 
See go.nature.com/wyygdg
for more.


Fire above lab 

A major underground
laboratory at the bottom of the
Soudan Mine in Minnesota
was not seriously harmed by a
fire that broke out on 17 March
in a lift shaft serving the lab.
However, physics experiments
there were suspended this
week as crews worked to
restart ventilation systems and
pumps after quenching the
fire. Among the high-energy
experiments hosted by the
University of Minnesota lab
is the Cryogenic Dark Matter
Search, which looks for signals
of dark matter passing through
crystals of germanium and
silicon. See go.nature.com/ma2spj
for more and page 433
for a News Feature about
dark-matter detectors.


NUCLEAR CRISIS RATTLES EUROPEAN ENERGY MARKETS


Electricity and carbon-permit prices rose as investors worried that civil
nuclear programmes might stall after the scare at Fukushima, Japan.


SEVEN DAYS | THIS WEEK


27-31 MARCH

The American Chemical
Society holds its spring
meeting in Anaheim,
California, focusing on
the chemistry of natural
resources.
go.nature.com/egqhue


28 MARCH

The Royal Society in
London releases a report
mapping how science is
done around the world,
and how these patterns
are changing.
go.nature.com/ivabqu


Clinical-trial access 
The European Medicines 
Agency has started to allow 
public access to its database
of clinical trials. Information
on interventional trials that
are being carried out in all
27 European Union (EU)
member states, as well as in 
Iceland, Liechtenstein and 
Norway, will be searchable in 
the EU Clinical Trials Register 
(go.nature.com/cacwil). As 
part of transparency measures, 
the site will gradually publish 
information from EudraCT, 
the EU’s database of clinical 
trials dating from May 2004, 
which is not publicly available 
at the moment. 


TOKYO ELECTRIC POWER CO./PRESS ASSOCIATION IMAGES


Smoke rising from a reactor at the Fukushima Daiichi nuclear plant meant workers had to be evacuated. 


The meltdown 
that wasn’t 


How a handful of operators at a crippled reactor averted a 
greater catastrophe at the Fukushima plant. 


BY GEOFF BRUMFIEL 


The magnitude-9.0 earthquake rocked
the Fukushima Daiichi nuclear power
station at 2:46 p.m. on 11 March, but
the real emergency began an hour later. A wall
of water swept across the site, washing away
power lines and the fuel tanks for the emergency
backup generators designed to take over
if grid power failed. Inside the control room of
the unit 1 reactor, the lights went out and the
1970s-vintage analogue gauges drifted to zero.

It will probably be years before anyone 
knows exactly what happened inside the three 
reactors at Fukushima Daiichi that seem to 
have partially melted down in the wake of 
the tsunami. But from press reports, public 
statements and interviews with experts, it is
possible to work out the most likely scenario.
And already it is clear that decisions made in 
the initial 24 hours by the handful of operators 
in the control room probably averted a much 
greater nuclear catastrophe than the one that 
now faces Japan. 

In the moments after the power was lost,
the operators “would have literally been blind”,
says Margaret Harding, a nuclear engineer in
Wilmington, North Carolina. Harding worked
for two decades with General Electric, which
designed Fukushima’s boiling-water reactors,
and she witnessed a similar outage in 1984
during a safety test at a boiling-water reactor
in Switzerland. “Basically the emergency
lights came on and all the panels went black,”
Harding says.

During the Swiss test, the power returned in 
5 minutes. At Fukushima, batteries ran a hand- 
ful of emergency lights in the control room and 
a few instruments tracking the reactor’s vital 
signs, such as the pressure inside the core. 

The core was next door. Inside a large, cube- 
shaped building, enclosed in a heavy concrete 
containment vessel, sat a thick, steel capsule 
filled with around 50 tonnes of uranium. Until 
an hour previously, that fuel had been pumping 
out 460 megawatts of power, but the reactor had 
automatically shut down immediately after the 
earthquake. Boron-carbon control rods driven 
between the long columns of fuel had soaked 
up neutrons and halted the nuclear reactions. 


MODEL RESPONSE 
That didn't mean the reactor was cold. Radio- 
active by-products of the fission reactions 
still generated heat — some 7 megawatts of it, 
preliminary computer models by the National 
Nuclear Laboratory in Sellafield, UK, suggest. 
The fuel still needed to be actively cooled. 
Without power, operators could use steam 
from the reactor’s pressure vessel, plus minimal 
amounts of battery power, to drive a pump that 
would keep the cooling water circulating. What 
they probably didn’t know was that the cooling 
system had sprung a leak. The leak caused water 
levels inside the core to drop, allowing the fuel 
to heat up, which generated more steam and 
raised the pressure inside the steel vessel. The 
emergency cooling system was unable to cope,
according to a press release from the Tokyo
Electric Power Company (TEPCO), the plant’s
operator. At 7:03 p.m. a state of nuclear emergency
was declared. Less than 2 hours later,
evacuations began within a 2-kilometre radius
of the plant.


AN UNFOLDING CRISIS

At the Fukushima Daiichi power station, the plant hit hardest by the earthquake and
tsunami, unit 1 was the first of four operating reactors to reach a point of crisis.

FRI 11 MARCH
14:46 The M9.0 earthquake struck; reactors were shut down.
15:42 Mains power was lost.
15:45 Oil tanks were washed away by tsunami.
16:36 Emergency core cooling system failed in unit 1 and unit 2.
19:03 A nuclear emergency was declared at Fukushima Daiichi.
20:50 Residents living within 2 km of Fukushima Daiichi were told to evacuate.

SAT 12 MARCH
4:00 Pressure in the containment at unit 1 reaches 840 kPa, twice the design value.
5:44 Evacuation was expanded to a 10-km radius of Fukushima Daiichi.
14:30 Steam was vented from unit 1.
14:49 Radioactive caesium was detected around unit 1.
15:36 Hydrogen explosion took place at unit 1.
20:20 Sea water was injected into the reactor at unit 1.

SUN 13 MARCH
13:12 Sea water was injected into the reactor at unit 3.

MON 14 MARCH
4:08 Spent fuel pool at unit 4 overheated.
11:01 Unit 3 exploded.
16:34 Sea water was injected into the reactor at unit 2.

TUE 15 MARCH
6:10 Unit 2 exploded.
9:38 Fire broke out at the reactor building of unit 4.

A satellite view of the Fukushima Daiichi plant shows heavy damage
from explosions or fire at three of the four affected reactors.

Labels on the image:
Core contains plutonium, which is especially dangerous if released.
Spent fuel in storage pond may have melted, releasing radiation.
The explosion damaged the building but left the containment intact.
Hydrogen explosion may have damaged the containment vessel.


NATURE.COM
For full coverage of the earthquake and nuclear crisis, see:
go.nature.com/ulsz2n


By 4:00 a.m. the pressure inside the thick
steel vessel of unit 1 had reached 840 kilopascals
(kPa), more than twice the operating
limit, according to the Nuclear and Industrial
Safety Agency, the Japanese nuclear regulator.
Radiation levels at the front gate of the
site had begun to rise above background,
although they were still far from dangerous. At
5:44 a.m. the evacuation cordon was expanded
to 10 kilometres.

At some point, the falling water levels must
have left the fuel exposed. In a reactor such
as unit 1, the uranium pellets are enclosed in
long, skinny pipes made of zirconium alloy,
chosen because it does not inhibit the neutrons
needed to drive the fission reactions. As
temperatures rose above 1,000 °C, the steam
in the pressure vessel began to oxidize the
zirconium, probably releasing hydrogen gas.
Meanwhile, fuel pellets, liberated from their
shell, began to fall to the bottom of the reactor.
The meltdown had begun.

This was the crucial moment. If the operators
at unit 1 could not stem the meltdown, the
fuel would gather at the bottom of the vessel.
The uranium pellets, now in close proximity,
could begin exchanging neutrons and resume
their heat-producing nuclear reactions. Slowly,
the pile could build towards a ‘critical mass’
that would restart the nuclear process normally
used to generate electricity.


TURNING POINT

Nobody can be sure about this sequence of
events because there has never been a full
meltdown in a boiling-water reactor. Harding
says that she thinks it’s unlikely that the
nuclear processes would have reignited. Even
if they did, the worst case, in her opinion, is
that the fuel would have burned through the
steel pressure vessel and splattered onto the
‘base mat’, a thick concrete slab that would
have spread out the fuel, extinguishing any
fission reactions.

But even that might have been catastrophic. 
The volatile hydrogen gas generated by the 
zirconium was safe inside the steel pressure 
vessel, but it was liable to explode if exposed 
to air in the outer containment vessel. If the 
blast were big enough, it might have breached 
the outer vessel’s thick, concrete walls. 

This scenario is highly unlikely, but had it
happened, the workers struggling to save the
plant would almost certainly have received a
lethal dose of radiation,
says Malcolm Sperrin, a
medical physicist at the
Royal Berkshire Hospital in
Reading, UK. Citizens near
the plant could have been at
higher risk for cancer later
in life, he says. And the
contamination would have
made emergency operations
much more difficult at the other reactors,
which were also in trouble. The situation could
easily have spiralled out of control.

Just metres away was a vast reservoir of sea 
water. It could stop the reactor’s meltdown, but 
operators had no way to pump it into the core. 
Emergency generators could not be hooked 
into the system, for reasons that are still unclear. 

At some point, somebody on the site real- 
ized that fire engines were essentially giant 
portable pumps with their own power supplies. 
“The fire trucks were brilliant,” Harding says.
“I’m not sure I would have thought of that.”
Engines were rushed to the plant and hooked 
into the lifeless emergency cooling system. Yet 
there was still a problem: the pressure in the 
core was too high for the engines’ pumps to 
force in the sea water. 
Around 2:30 on Saturday afternoon,
operators began to vent pressure from the 
containment vessel. An hour later something 
sparked the gas that had built up inside the 
outer building during venting. The entire top 
of unit 1 was blown away, and four workers 
were injured, although the sturdy concrete 
containment vessel below seems to have 
survived the blast. 


CHAIN REACTION 

The explosion, broadcast around the world, 
was the first of a series of setbacks at the reactor
complex. In the ensuing days, reactors 3
and 2 followed a similar path to unit 1 (see ‘An 
unfolding crisis’); each was rocked by a mas- 
sive hydrogen explosion. In units 3 and 4, the 
pools for storing used fuel lost their cooling 
water and it is believed that the rods began to 
melt, emitting more explosive hydrogen along 
with powerful radiation. 

At the time of writing, radioactive material 
from Fukushima Daiichi continues to blow 
across Japan at levels high enough to cause 
concern for Sperrin — although he says that 
they are not immediately dangerous. In the 
coming weeks and months, the government, 
TEPCO and safety authorities are likely to 
face heavy criticism. People will ask what 
went wrong. 

Still, at unit 1 the immediate crisis has 
passed. With the pressure down, fire engines 
began to flood the reactor with sea water 
at 8:20 p.m. on 12 March, allowing the fuel 
to slowly cool to a safe temperature. The 
response at unit 1 also provided a model for 
stabilizing the other two reactors. And day by 
day, the radioactive decay in the reactor cores 
is ebbing. It could be days or weeks before 
the reactors are truly safe, but for now things 
remain stable. 

As for the operators at unit 1, says Harding, 
“I think they really did respond pretty well.”


DIGITALGLOBE/REUTERS. 


SOURCE: CENTRAL INST. FOR METEOROLOGY AND GEODYNAMICS 
IN FOCUS | NEWS


Radiation risks unknown 


Scientists struggle to calculate long-term effects of low-dose exposures in Fukushima. 


BY GWYNETH DICKEY ZAKAIB 


One thing is certain about the human
costs of the radiation leaking from the
Fukushima Daiichi nuclear plant in
Japan: they will pale in comparison to the cata- 
strophic consequences of the 11 March earth- 
quake and tsunami that triggered the crisis. 
Nevertheless, experts are tracking radiation lev- 
els worldwide to learn more about the accident 
and to assess the possible impacts on health. 
Radioactive vapour and particles released 
from the plant have spread across the region 
and followed prevailing winds across the Pacific 
(see ‘Plume projections’). “The plume is very 
large,” says Ted Bowyer, a nuclear physicist at 
the Pacific Northwest National Laboratory in 
Richland, Washington, one of the first US sta- 
tions to detect isotopes released from Fukush- 
ima. Bowyer adds that the tiny concentrations 
of radioactive iodine, caesium, tellurium, xenon 
and lanthanum that have reached the United 
States are far below normal background levels 
and not a health risk. The fact that some of the
isotopes are short-lived indicates that at least 
some of the radiation must have originated from 
breaches in the reactor vessels and not from the 
plant’s overheated caches of spent fuel, he says. 




A farmer destroys spinach in Ibaraki prefecture after it was contaminated by radioactive iodine. 


In Fukushima and adjacent prefectures, the 
Japanese government is reporting radioactive 
contamination in sea water near the plant 
and in the food and water supply. Radioactive 
iodine-131 and caesium-137 have been detected 
in milk and leafy vegetables such as spinach, as 
well as in tap water, in some cases above allow- 
able levels for consumption. Such safety limits 
are based on long-term consumption of these 
foods, says William McCarthy, deputy director 
of the radiation protection programme within 
the Environment, Health and Safety Office 
at the Massachusetts Institute of Technology 
(MIT) in Cambridge. “The prudent thing is to 
not eat that food,” he says. “That doesn’t mean 
it poses immediate health risks.”

Authorities in Japan have banned the ship- 
ment of milk from Fukushima prefecture, as 
well as some produce from Fukushima and 
three neighbouring prefectures. In the short 
term, the main concern is iodine-131, which 
can cause cancer in the thyroid gland. With a 
half-life of 8 days, iodine-131 will effectively
be gone from the environment in a matter 
of months once releases have stopped. But 


caesium-137, another cancer-causing isotope, 
has a half-life of 30 years and will persist for 
much longer. Steve Wing, an epidemiologist 
from the University of North Carolina, Chapel 
Hill, points out that even the low levels of radia- 
tion that remain in the environment could be sig- 
nificant in the long run “because so many more 
people are exposed, even though the dose per 
person decreases farther from the plant”. 
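
The contrast between the two half-lives quoted above can be made concrete with the standard exponential-decay relation, in which the fraction remaining after time t is (1/2) raised to the power t divided by the half-life. The short Python sketch below is illustrative only: the function name and the sample time points are assumptions chosen for this example, and the calculation covers radioactive decay alone, ignoring weather, washout, uptake by plants and any continuing releases.

```python
def remaining_fraction(elapsed: float, half_life: float) -> float:
    """Fraction of an isotope left after `elapsed` time.

    `elapsed` and `half_life` must be in the same unit (days, years, ...).
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if elapsed < 0:
        raise ValueError("elapsed must be non-negative")
    return 0.5 ** (elapsed / half_life)


# Half-lives as quoted in the article.
IODINE_131_HALF_LIFE_DAYS = 8.0
CAESIUM_137_HALF_LIFE_YEARS = 30.0

if __name__ == "__main__":
    # Sample time points are hypothetical, chosen only for illustration.
    for days in (8, 40, 80):
        frac = remaining_fraction(days, IODINE_131_HALF_LIFE_DAYS)
        print(f"iodine-131 after {days} days: {frac:.4%} remains")
    for years in (1, 30, 100):
        frac = remaining_fraction(years, CAESIUM_137_HALF_LIFE_YEARS)
        print(f"caesium-137 after {years} years: {frac:.2%} remains")
```

After ten half-lives (80 days for iodine-131) roughly a thousandth of the original amount is left, which is consistent with the isotope being effectively gone within months, whereas half of any caesium-137 is still present after 30 years.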

Jacquelyn Yanch, a radiation physicist at 
MIT, thinks that it is too early to say what the 
impact will be. “We haven't come up with risk 
estimates for a situation like this,” she says. “We 
don't know how much is too much.” 

Experts agree that any long-term effects are 
most likely to be seen in the workers battling 
the crisis at the Fukushima nuclear station. The 
government has increased the allowable dose 
for workers from 100 millisieverts per year 
to 250 millisieverts per year — five times the 
annual allowable dose for US radiation workers 
— to allow emergency operations to continue. 
This dose is considered by the US National 
Institutes of Health as the lower limit for the 
first symptoms of radiation sickness.
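
The dose figures in this paragraph relate to one another by simple ratios, sketched below in Python. The names are illustrative, and the 50-millisievert value is not stated in the article; it is only what the "five times" comparison implies for the US annual limit.

```python
# Dose limits for Fukushima workers, as quoted in the article (mSv per year).
PREVIOUS_LIMIT_MSV = 100.0
RAISED_LIMIT_MSV = 250.0

# The article says the raised limit is five times the US annual limit.
QUOTED_RATIO_TO_US_LIMIT = 5.0


def implied_us_limit_msv() -> float:
    """US annual limit implied by the 'five times' comparison (an inference)."""
    return RAISED_LIMIT_MSV / QUOTED_RATIO_TO_US_LIMIT


def increase_factor() -> float:
    """How many times larger the raised limit is than the previous one."""
    return RAISED_LIMIT_MSV / PREVIOUS_LIMIT_MSV


if __name__ == "__main__":
    print(f"implied US annual limit: {implied_us_limit_msv():.0f} mSv")
    print(f"raised limit is {increase_factor():.1f}x the previous limit")
```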


Some buildings at Tohoku University are too damaged to be entered safely. 


Quake shakes 
Japan’s science 


Natural disaster leaves researchers struggling with broken 
equipment and a crippled infrastructure. 


BY ICHIKO FUYUNO 


The magnitude-9.0 earthquake that
struck northeastern Japan on 11 March
trashed Koji Tamura’s laboratory and
office, flinging books, microscopes, sequencers
and samples to the floor. The geckos, Xenopus
frogs and zebrafish that the Tohoku University
researcher uses to study organ development
survived the quake but now face a slow death,
because disrupted water supplies mean their
tanks may run dry. “Without water, I am worried
how long our animals can survive,” he says. “But
I cannot think about research at a time when
many suffering people need water to live.”

Like many scientists in Japan, Tamura is 
both anxious over disrupted research plans 
and heartbroken at the human toll of the 
earthquake and the ensuing tsunami, which 
killed thousands of people and left nearly 
half a million homeless. No casualties have 
been reported on campus at universities and 
research institutes so far. But more than a week 
after the event, scientists taking stock of the 
damage foresee a long, difficult recovery, as 
disrupted infrastructure and power outages in 


Tokyo and other eastern parts of Japan add to 
the physical damage at labs and other facilities. 

The earthquake hit hardest at Tohoku Uni- 
versity, a materials-science, engineering and 
biomedicine powerhouse in the city of Sendai, 
close to the epicentre. At the university, which 
is expected to remain closed until late April, an 
emergency team is assessing the damage, but 
scarce electricity, gas and water, coupled with 
intermittent aftershocks, are making inspec- 
tions extremely difficult. The tsunami flooded 
one building at a field station of the Graduate 
School of Agricultural Science on the coast 
north of Sendai, and six buildings on the main 
campus are too dangerous to enter. 

The university’s WPI-Advanced Institute for 
Materials Research, renowned for its work on 
metallic glasses, polymers and nanodevices, 
has lost ¥1 billion (US$12.5 million) of equip- 
ment, and the cost is likely to increase when 
the damage is assessed in detail, says Yoshinori 
Yamamoto, the institute’s director. Broken 
instruments include some of the world’s best 
electron microscopes and instruments for 
studying the atomic arrangement of surfaces. 

Farther from the epicentre, the Japan Proton 
Accelerator Research Complex (J-PARC), 
south of Sendai on the coastline of Ibaraki
prefecture, has been shut down. Its three 
accelerators seem to be intact, and the facil- 
ity escaped damage from the tsunami. But the 
earthquake cut off the water supply, buckled 
nearby roads and damaged computer servers. 
This week researchers will begin switching on 
the facilities for preliminary inspections. 

The High-Energy Accelerator Research 
Organization in Tsukuba has also seen its 
Photon Factory synchrotron crippled. “The 
linear accelerator has seen some substantial 
damages,” Soichi Wakatsuki, director of the 
Photon Factory, wrote to the international 
community on 15 March. But he noted that 
“five protein crystallography beamlines have 
been spared of major damages”. 

Meanwhile, the shutdowns of the Fukushima 
Daiichi reactor complex and other nuclear 
plants after the earthquake have led to elec- 
tricity shortages in Tokyo and neighbouring 
prefectures, where temporary outages are 
planned every day. Many institutions in the 
region, including the University of Tokyo and 
some RIKEN institutes, have been forced to 
drastically reduce electricity use and shut down 
large facilities such as supercomputers. 

Uncertainty resulting from the disrupted 
infrastructure and the nuclear crisis is prompt- 
ing foreign nationals to decamp for cities 
farther south or overseas. “That is the major 
problem,” says Adrian Moore, a unit leader of
the Brain Science Institute at RIKEN in Wako. 
Five out of his six non-Japanese postdocs and 
students have left and will not return until the 
situation improves. At J-PARC, all foreign 
researchers have flown home or are being 
housed in dormitories in Tsukuba, where the 
infrastructure held up better. The government 
is considering emergency funding for rebuild- 
ing universities and research institutes. 

Amid the depressing circumstances, there 
are bright notes. On the day of the earth- 
quake, Japan’s research vessel Chikyu, capable 
of drilling seven kilometres into the sea floor, 
was docked in Hachinohe, north of Sendai, 
preparing for a voyage to sample coal beds 
deep under the sea floor. 

Within 30 minutes of the quake, the ship 
undocked with 200 people aboard, including 
48 elementary school pupils on a tour, because 
a ship at sea is safer than the shore during a 
tsunami. The tsunami arrived almost imme- 
diately afterwards, spinning the 57,000-tonne 
vessel 2.5 times but causing no injuries. The 
only damage was to one of the ship’s six thrust- 
ers. “It was almost a miracle that no life was 
lost,” says Fumio Inagaki, co-chief scientist of
the coal-bed expedition at the Japan Agency 
for Marine-Earth Science and Technology. The 
upcoming voyage was cancelled, however. 

Yukihisa Kitamura, executive vice-president 
of Tohoku University, says students and faculty 
members are encouraged by support messages 
from around the world. “We are regaining our 
enthusiasm rather than giving in,” he says.




Europe makes do without NASA 


US budget crisis forces European Space Agency to abandon plans for joint mission. 


BY EUGENIE SAMUEL REICH 


The European Space Agency (ESA) is
pushing ahead without NASA support
for its next big science mission, as the
ongoing US budget crunch and competing priorities
impose serious constraints on the US
space agency (see Nature 471, 278; 2011). ESA
last week told leaders of three large, or ‘L-class’,
missions that are competing for funding to
revise their proposals by leaving out the substantial
US contribution that had previously
been assumed.

“The decision was made very reluctantly,” 
says David Southwood, director of science and 
robotic exploration at ESA. “NASA could not 
meet our timetable to launch.”

The budget available to the winning L-class 
mission, set to launch in 2020, will be cut by 
40-50% to a round figure of €700 million 
(US$996 million). The competing missions are 
the International X-ray Observatory (IXO), a 
telescope that would observe black holes and 


the formation of galaxies; the Laser Interfer- 
ometer Space Antenna (LISA), which would 
detect gravitational waves; and the Europa 
Jupiter System Mission (EJSM-Laplace), in 
which twin probes would head to Jupiter’s 
icy moons Europa and Ganymede. ESA had 
planned to choose a winner in June, but it has 
now put off the decision until February 2012 
while teams rework their proposals. 

The reduced budget will affect the science 
capabilities of each mission, but team leaders 
are optimistic. Joel Bregman, a member of the 
science definition team for IXO, says that even 
if the telescope is built with a smaller mirror 
than planned and only two or three instru- 
ments instead of the proposed six, it will still 
be a step ahead of current X-ray missions. And
Karsten Danzmann,
European co-chairman
of the LISA international
science team, says that a
more modest LISA could
still detect gravitational
waves, but would be sensitive to waves emanating
from fewer kinds of astrophysical source.

NATURE.COM
For more on ESA’s proposed L-class missions, visit:
go.nature.com/ilu2m9

Ronald Greeley, US co-chairman of the 
EJSM-Laplace science team, says that his group's 
mission may be the easiest to rework, because it 
involves two spacecraft. The NASA-led Europa 
probe could be dropped — although the mis- 
sion would lose much of its scientific appeal. 

NASA still hopes to contribute in smaller 
ways to ESA’s L-class mission, but it faces short- 
falls with a flagship programme of its own: 
James Green, director of NASA’s planetary-science
division, told an advisory committee
on 16 March that under President Barack 
Obama's 2012 budget request, the agency has 
about $1 billion to commit to phase one of its 
top planetary-science priority, a sample-return 
mission to Mars. The mission is expected to 
cost far more, so it will not be able to go for- 
ward without a comparable contribution from 
ESA. A bilateral meeting between NASA and 
ESA officials is set for 29-30 March in Pasadena,
California.
Cattle have been released into parts of Australia’s Alpine National Park, despite a ban on grazing there.


Australian grazing 
trial ignites debate 


Researchers question science behind controversial effort to 
examine cattle’s role in controlling bushfires. 


BY EMMA MARRIS 


To ecologists overseas, the invitation
might sound tempting. It offers travel
to Australia and unspecified remuneration
to serve on an advisory panel considering
a juicy scientific question: could allowing
cattle to graze in the country’s Alpine National
Park — the picturesque setting for the film The
Man from Snowy River — reduce the risk of
bushfires? Those responding to the call, issued
by the federal state of Victoria at the beginning
of March, might not realize that it also involves
walking into a serious stoush — that’s Australian
for fight.

More than 100 Australian ecologists have 
signed a letter to the Australian government 
denouncing the trial. The letter’s organizers 
claim that the trial is a naked attempt to use the 
imprimatur of science to allow cattle to graze in 
an ecologically sensitive area, and they now fear 
that international scientists ignorant of the ruse 
will be duped into lending credibility to the 
project. “It’s a misuse of the word science to justify
a political decision,” says Georgia Garrard,
an ecologist at the University of Melbourne 
who helped to organize the protest letter. 

The decision in question is the return of 
cattle to portions of the 646,000-hectare park, 
a landscape of deep ravines, high plateaux and 
snow gum trees. The move fulfils a campaign
pledge by Victoria’s centre-right coalition,
which was narrowly elected last November.

In the 1950s, as many as 100,000 head of 
Aberdeen Angus and Hereford grazed in the 
park, according to Mark Coleman, president 
of the Mountain Cattlemen’s Association 
of Victoria. Gradual restrictions since then 
culminated in a park-wide grazing ban 
in 2005 after a Victoria government 
review found that cattle didn’t reduce 
the intense wildfires that can 
visit the region, but that their 
hooves, grazing and manure 
damage sensitive wetland 
ecosystems. 

Coleman says that the cattle 
do stop fires, by eating the veg- 
etation that forms potential fuel, and that 
ecologists have ignored mountain cattlemen’s 
knowledge of the land. “We've got generational 
knowledge that goes back — 150 years in my 
family. You can't buy that knowledge; you can’t 
learn it in a university.”

In January, when 400 head of cattle were 
allowed back into the park, the government of 
Victoria announced that they were part of a 
research trial. Opponents of the move argue 
that no baseline data were taken and that the 
trial has not been designed. Designing the 
trial will be one of the advisory panel's tasks, 
says Peter Appleford, executive director of 
forests and parks for Victoria. He defends the
lack of baseline data, saying that, for a large 
landscape-scale trial, such data would need 
to have been kept for decades to be relevant. 
“Some of these scientists have been trained on
small plots and don’t understand landscape-scale
experiments,” he adds.

Libby Rumpff, an ecologist at the University 
of Melbourne and an organizer of the scientific 
protest, disagrees. “It doesn’t matter what scale 
you are looking at, you should be collecting 
baseline data,” she says. 

The Melbourne newspaper The Sunday Age 
has charged the Victoria government with try- 
ing to blackmail the University of Melbourne 
into participating in the trial by threatening its 
state funding, a charge the government denies. 
And John DuBois, a spokesman for the uni- 
versity, says that researchers at its School of 
Land and Environment have told the govern- 
ment that the trial may amount to repetition 
of previous work. The university has no plans 
to participate directly. 

Mark Adams, an ecologist at the University 
of Sydney, has been invited by the Victoria 
government to “lead the program of research 
that will accompany” the trial, according to 
Victoria's website. Adams says that no contract 
has been signed between the university and the 
government, “and without a contract I won't 
be doing anything”. If a contract is signed, he
adds, he won't be designing a trial, but merely 
“testing methods suitable for measuring the 
impact of cattle on fuels and on ecosystem 
functions”. Adams says that he has taken a lot 
of flak for his willingness to work with Victo- 
ria, but feels that academics “have a duty to 
work with government to try to get the best 
outcomes in the public interest”. 

On 18 March, Tony Burke, Australia’s envi- 
ronment minister, demanded that the state 
ask the federal government for approval to
release the cattle. The demand means
that the cattle must be removed from 
the park until the federal govern- 
ment has reviewed the matter. But 
even without this demand, the 
animals’ time in the highlands 
is almost up. Winter snows 
will arrive soon, and the cattle 
would have been taken down 
from the mountains by mid- 
April anyway. Furthermore, it is 
not clear what the federal decision means for 
the “excellent international candidates” who 
Appleford says have already applied to sit on 
the advisory panel. 

The point of the episode, says Garrard, is
“that science shouldn’t be misused for political
gain”. Coleman sees both sides as politicized.
The ecologists have been “indoctrinated” by
politicians on the left, he says, whereas the coalition
is trying to deliver on a political promise.
“Ninety per cent of the general public couldn’t
give a shit about the cattle or the environment,”
he says. “At the end of the day, it is all politics.”
US government scientists
test limits of conflict rules 


Policy change has made it easier to serve on boards of scientific societies. 


BY EUGENIE SAMUEL REICH 


When Mike McPhaden was elected
president of the American Geophysical
Union (AGU) last year, he
was delighted — but he wasn't sure he would 
be able to take up the position. McPhaden 
is an oceanographer at the Pacific Marine 
Environmental Laboratory in Seattle, Wash- 
ington, which is run by the National Oceanic 
and Atmospheric Administration (NOAA). 
Lawyers at the Department of Commerce, 
which oversees NOAA, were concerned that 
leading a scientific organization that 
lobbies the government on funding 
and policy matters would create a 
conflict of interest for McPhaden. 
“There was resistance,” he says.

In the end, McPhaden convinced 
the agency that taking up the posi- 
tion would bring prestige to his 
government role and enhance the 
credibility of NOAA science. Today, 
a memorandum of understanding 
between the AGU and NOAA even 
allows him to spend some of his 
government-paid time working 
for the scientific society, although 
he has to recuse himself from both 
fund-raising and lobbying. 

Now, changes in US govern- 
ment policy should make it much 
easier for government scientists to serve in 
scientific societies. A memorandum on sci- 
entific integrity issued by the Office of Science 
and Technology Policy (OSTP) in December 
explicitly encourages government scientists 
to get involved with societies; previously, the 
government tended to view such associations 
ambivalently or negatively. Yet many govern- 
ment scientists affected by the policy change 
say that serious legal and ethical pitfalls remain. 

Unlike in countries such as Britain, which 
has no rule against government scientists 
serving on society boards, the strict conflict- 
of-interest rules in the United States can cre- 
ate administrative barriers for government 
scientists trying to participate in societies that
are relevant to their disciplines. Under US law,
for example, government officials are barred
from participating in matters in which they or
organizations they are associated with have a
financial interest. In some cases, the restrictions
have been interpreted as preventing government
employees from lobbying. Employees who join
outside organizations will have to be careful not
to run afoul of these rules, notes John Fitzgerald,
policy director of the Society for Conservation
Biology in Washington DC. Fitzgerald supports
the more permissive policy but cautions that a
government scientist who lobbies Congress could
be “skating on thin ice”.

NATURE.COM: For more on the OSTP guidance, see go.nature.com/terSuj

William Talman takes leave from his government job to lobby Congress.

A scientist who has navigated that issue is
William Talman, president of the Federation
of American Societies for Experimental
Biology in Washington DC, who also works 
as a physician at a hospital in Iowa City run by
the US government’s Department of Veterans 
Affairs. Acting as president of the federation, 
Talman has written to senators to advocate for 
generous funding for the National Institutes 
of Health and testified in Congress in support 
of budget boosts for the US National Science 
Foundation. Talman says that he takes unpaid 
leave from his government job for those activi- 
ties, an arrangement that happens to fit with 
the new policy. As a result, he says, “I would 
argue there's no conflict of interest.” 

Biologist Gabriela Chavarria, who is science 
adviser to the director of the US Fish and Wild- 
life Service, worried about a different issue when 
the Society for Conservation Biology invited her 
to serve on its board earlier this month. Following
discussions with the agency, Chavarria
turned down the offer. The Fish and Wildlife
Service gives the society a few thousand dol- 
lars each year to spend on scientific meetings. 
She was worried that some might think she got 
the prestigious board position in exchange for 
ensuring that the funding continued, and her 
colleagues shared her concerns. “At the end of
the day it’s about credibility,” she says.


DIFFERENT EMPHASIS 

The US Geological Survey (USGS) has a more 
liberal approach. USGS scientists are encour- 
aged to serve in scientific societies, and 91 
currently do so. Their promotional 
prospects depend on them showing 
leadership in the research commu- 
nity — which they can do by being 
elected to a society board. However, 
USGS director Marcia McNutt says 
that conflicts of interest are less 
likely to arise at the USGS because 
the agency has no policy-making 
authority. 

Both the Fish and Wildlife Service 
and the USGS are run by the US 
Department of the Interior (DOI), 
and McNutt is pleased that the 
department has now consolidated its 
scientific-integrity policy. The new 
policy, introduced on 1 February, fol- 
lows the OSTP guidance and explic- 
itly encourages all researchers within 
the DOI’s jurisdiction to participate in scientific
societies, although they need to fill out forms 
before going ahead. That paperwork helps to 
ensure that researchers understand what kinds 
of behaviour could be considered a conflict — 
for example, serving on the board of a society 
and then signing a government purchase order 
for the society’s publications. “Scientists can be 
clueless about the trouble they can get them- 
selves into,” says McNutt. 

Various other agencies are now expected 
to work the OSTP guidance into their poli- 
cies, and watchdog groups are delighted at the 
changing attitude. Jeff Ruch, executive direc- 
tor of Public Employees for Environmental 
Responsibility in Washington DC, says that 
government lawyers have long been allowed 
to participate in professional organizations 
such as the American Bar Association, and 
says he doesn't see why the practice shouldn't 
be extended to scientists.
Genome builders
face the competition 


Three independent projects seek to contrast approaches 
in preparation for routine analysis of genetic data. 


BY ERIKA CHECK HAYDEN 


Sequencing DNA on an industrial scale is no longer
difficult: the challenge is in assembling a full
genome from the multitude of short, overlapping
snippets that second-generation sequencing
machines churn out.
Researchers can call on any of two dozen com- 
puter programs to do the job, but all have their 
flaws. With genome sequencing fast becom- 
ing standard practice across the life sciences, 
researchers want to know how to choose. 

The answer may come from three separate 
genome-assembly projects, each of which aims 
to test different algorithms on batches of raw 
sequence data and to compare the results. 

There won’t be any single ‘winner’, researchers
stress. There is no consensus way to determine
the absolute quality of a genome, and different
assemblers might do a better job of handling
different types of data.

“My dream is that a few years from now, a person
who is about to do a genome project will be able
to say, ‘This is our budget, these are the
characteristics of our genome; what is the
combination of sequencing technologies and
genome-assembly program that best fits our
project?’” says Ian Korf at the University of
California, Davis, who helped to organize the
Assemblathon, one of the three genome-assembly
evaluation projects.

Last December, the Assemblathon released 
a computer-generated human genome data 
set. Scientists were invited to use their assem- 
bler of choice to stitch the data into a genome. 
Seventeen teams from seven countries took up 
the challenge. Korf’s team then evaluated the 
assemblies on the basis of commonly used crite- 
ria for the quality of genome assemblies — such 
as the portion of the genome that is assembled 
into large chunks of DNA, or contigs — as well 
as less-common measurements, such as how 
many genes each assembly is able to capture. 
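
As a rough illustration of the contig-based criteria mentioned above, the sketch below computes two numbers from a list of contig lengths: the share of a genome that falls in contigs above a chosen size, and N50, a widely used summary statistic that the article does not name. The function names, the toy contig lengths and the 200-base genome size are all hypothetical, and this is not the Assemblathon's actual scoring code.

```python
def fraction_in_large_contigs(contig_lengths, genome_size, min_length):
    """Share of the genome covered by contigs of at least min_length bases."""
    covered = sum(length for length in contig_lengths if length >= min_length)
    return covered / genome_size


def n50(contig_lengths):
    """Largest length L such that contigs of length >= L together
    contain at least half of all assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


# Hypothetical toy assembly: five contigs from a 200-base genome.
toy_contigs = [50, 40, 30, 20, 10]
large_share = fraction_in_large_contigs(toy_contigs, genome_size=200, min_length=30)
toy_n50 = n50(toy_contigs)
```

Neither number says whether the contigs are correct, which is one reason evaluators also look at less-common measurements such as how many genes an assembly captures.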

At a meeting last week at the University of
California, Santa Cruz, three winners emerged:
ALLPATHS-LG, developed by the Broad
Institute in Cambridge, Massachusetts; ABySS,
developed at the British Columbia Cancer
Agency’s Genome Sciences Centre in Vancouver,
Canada; and SOAPdenovo, developed
by the Beijing Genomics Institute. But, Korf
notes, “it’s not just the software, it’s how people
are running it” that determines the quality of
each assembly.

Test case: researchers will evaluate programs assembling this bee’s genome.

A similar genome-assembly project called 
dnGASP has been organized by the National 
Center for Genome Analysis in Barcelona, 
Spain. Its results are set to be discussed at a 
workshop on 4-7 April. 

A third project, led by Steven Salzberg of the
University of Maryland, College Park, is evaluating
just five assemblers, among them ALLPATHS-LG
and SOAPdenovo. Salzberg’s group will perform
and evaluate all the assemblies. In addition, the
researchers will use real genome data from four
species, including the Argentine ant and the
common eastern bumblebee. “With purely
simulated data, you don’t get a realistic picture
of how these assemblers perform,” says Salzberg.

Later this year, the 
Assemblathon will launch 
another round of evalua- 
tion, comparing efforts to 
assemble two previously unreleased genomes, 
those of a parrot and a cichlid fish. And although
the three current efforts are focused on data 
generated by the popular Illumina sequencers, 
new sequencing methods could become 
commercially available as early as next year. 

Their output will differ from that of the Illumina
machines; the single molecule, real-time
(SMRT) technology developed by Pacific Bio- 
sciences of Menlo Park, California, for instance, 
produces longer reads but has higher error rates 
(see Nature 470, 155; 2011). This creates a new 
challenge, says Gene Robinson, an entomologist 
at the University of Illinois at Urbana-Cham- 
paign, whose bee sequence data are being used 
by the University of Maryland project. “Biolo- 
gists really want assembly algorithms that can 
make use of multiple forms of reads and build 
the best possible assembly,” Robinson says.

The contest is just beginning.
MISSING THE MARK


Why is it so hard to find a test to predict cancer?


BY LIZZIE BUCHEN


On 3 March, two studies appeared online
that offered 19 pages of gloomy reading
for anyone interested in cancer. They
focused on biological molecules, or
biomarkers, the presence of which in the blood
might be used to detect the earliest glimmers of
ovarian cancer — a disease not normally discovered
until it has destroyed the ovaries and
rotted other parts of the body. The researchers,
coordinated by the Early Detection Research
Network (EDRN) of the US National Cancer
Institute (NCI), had assembled 35 protein
biomarkers, including 5 panels of proteins, that
had looked the most promising in early studies.
They had carried out rigorous testing —
screening blood samples from more than 1,000
women — to ask whether these seemingly
breakthrough biomarkers were better at iden- 
tifying women with early ovarian cancer than 
the one flawed biomarker that had been in use 
for almost 30 years, CA-125. None of them
was. “CA-125 remains the ‘best of a bad
lot’,” read an accompanying perspective article.
“The new candidates have fallen short of
expectations.” 

Tied in last place for its poor performance 
among the biomarker panels was one iden- 
tified by Gil Mor, a cancer biologist at Yale 
University in New Haven, Connecticut. Mor’s 
six-protein panel detected ovarian cancer in 
only 34% of the women who were diagnosed 
with the disease within a year. (CA-125, by 
contrast, detected 63%.) Mor’s panel already
had a tortured history. A primary research 
paper behind it had been criticized by other 
scientists for allegedly using inappropriate 
statistical calculations and for optimistically 
concluding that the test would help women 
before rigorous follow-up studies proved that it 
could. Yet for four months in 2008, the test was 
sold to patients by Laboratory Corporation 
of America (LabCorp) in Burlington, North 
Carolina, the company that licensed the panel 
from Yale. LabCorp had marketed the test 
under the name OvaSure until the US Food 
and Drug Administration (FDA) intervened 
and the company pulled it from the market. 
The panel offered “invaluable object lessons”
for bringing a test prematurely to the clinic,
wrote the authors of the perspective article.

Biomarkers might help to detect ovarian tumours (large round masses) early in the disease.

Similar lessons can be found in the sto- 
ries behind many cancer biomarkers that 
have sputtered and failed on their way to the 
clinic. Those tests that are in clinical use — 
including prostate-specific antigen (PSA) for 
prostate cancer, mammogram-detected 
masses for breast cancer and CA-125 — fail 
to detect all cancers and sometimes ‘detect’ 
ones that aren't there. Genomics, proteomics 
and other such technologies promised to help 
by finding combinations of markers that are 
more powerful and cancer-specific than indi- 
vidual ones, but that promise has not been 
realized. Researchers using such technologies 
have published studies on thousands of pan- 
els, suggesting that they can detect early-stage 
disease, guide patient treatment and monitor 
recurrence. But only a tiny number of such 
tests have reached the clinic — and none for 
the early detection of cancer, the biggest clini- 
cal challenge of all. “Much biomarker research 
has been done very badly for decades,” believes 
Lisa McShane, a biostatistician at the NCI in 
Rockville, Maryland. “Even when it was single 
markers. Now, as we're moving up to multiple 
markers, all our bad habits are coming back to 
bite us in a big way.” 

These habits have been thrown into the 
spotlight by the EDRN’s study, one of the larg- 
est and most systematic validation studies of 
biomarkers so far. It came just months after
a high-profile decision at Duke University in
Durham, North Carolina, to suspend clinical 
trials of a genomics-based biomarker panel 
designed to direct chemotherapy in patients 
with breast cancer. A number of scientists 
had raised concerns about the Duke group's 
data and analysis, and the trial was stopped 
after allegations came to light that the lead 
researcher, geneticist Anil Potti, had made 
false claims on his CV. Last September, the 
Institute of Medicine (IOM), part of the US 
National Academies, assembled a committee 
to discuss lessons for developing tests based 
on ‘omics’ technologies and bringing them to 
the clinic. “Why don't we have assays out there, 
with this enormous promise?” Dan Hayes, a 
breast-cancer researcher at the University of 
Michigan in Ann Arbor, asked researchers at
the first IOM committee meeting in December
2010. “It’s either because these things just don’t
work, or because we’ve used sloppy science to
test them.”

It is too early to say whether either of these 
is true: the field is still young, and faces many 
challenges. It has drawn in many cancer biolo- 
gists who are excited by the potential to translate 
their work to the clinic — but they sometimes 
lack the expertise or resources needed to pursue 
translational or clinical work. “A lot of novices 
came in. They get in without realizing that
the problem may be more complex than it appears,”
says Eleftherios Diamandis, a clinical biochemist
at the University of Toronto in Canada. And
although most experts agree that potential
biomarkers for early cancer detection should be
validated on samples taken before diagnosis — the
stage at which the test would be used in the
clinic — that is a step that few groups attempt and
no biomarker for ovarian cancer has passed, as the
EDRN study made clear. “Sometimes the glamour of
the technology or the sheer volume of omics
data seem to make investigators forget basic
scientific principles,” said McShane at the IOM
meeting. Mor agrees that the field has faced
problems, and that it is important for markers
to go through a careful process of design and
validation, as he tried to do.

NATURE.COM: A possible way forward for cancer biomarkers: go.nature.com/icwtue

“There’s been an enormous amount of hype
and promise,” sums up David Ransohoff, a cancer
epidemiologist at the University of North
Carolina in Chapel Hill. “But after 10 or 15 years
of intense work in these fields, there’s simply not
a lot to show for it. It’s important for the whole
field to step back and look at what is wrong.”


MAKING A DIFFERENCE 

Mor began his career in Israel, where he trained
as a clinician at the Hebrew University of Jeru- 
salem. But an experience in the final years 
of his oncology residency compelled him to 
change course. A young woman arrived at the 
hospital with ovarian cancer, a disease that 
kills some 140,000 women worldwide each 
year. The oncology team removed the woman's 
ovaries and put her through several rounds of 
chemotherapy, which seemed to be successful. 
But 18 months later, she was back, her body 
riddled with tumours, and she soon died. 
“Chemotherapy didn't do anything for her,” 
Mor recalls. “She was 29. She was a beautiful
girl. An impressive girl. A medical student. And
I never understood what happened to her.”
Mor decided to leave medicine, which had 
been unable to save her, for research, which 
one day might. He earned a PhD studying 
ovarian cancer at the Weizmann Institute of 
Science in Rehovot, Israel, before moving to 
Yale in 1997. He went on to start a programme
called Discovery to Cure, aiming to speed cancer
research to the clinic. The group began to
build a bank of blood and tissue samples,
including some from a Yale clinic for women
with a high risk of ovarian cancer owing to a
family history of the disease. “There was a lot
of excitement around that time for finding
proteins specific to cancer,” says Mor.

In 2003, David Ward, then a geneticist at Yale,
contacted Mor. Ward had co-founded Molec- 
ular Staging, a company in New Haven that 
had developed a ‘high-throughput’ technique 
for quantifying multiple proteins in the blood 
using arrays of antibodies. He asked whether
he could use Mor’s samples to search for mark- 
ers of early ovarian cancer. 

Mor had never been involved with bio- 
marker research — “I do biology of cancer, 
not biomarker development,” he says — but 
he signed up, intrigued by the clinical poten- 
tial of the technology. Ward had scoured the 
literature for proteins that had been associ- 
ated with ovarian-cancer growth and malig- 
nancy, and had come up with 169 candidates. 
Using the protein-quantification technique, 
Ward’s company screened blood samples in 
Mor’s tissue bank that came from two groups: 
women with newly diagnosed ovarian can- 
cer who had been enrolled in Yale’s high-risk 
clinic, and women who had come to the hos- 
pital for routine gynaecological exams. Using 
additional cancer-patient samples, they whit- 
tled the list down to four proteins: leptin, pro- 
lactin, osteopontin and insulin-like growth 
factor II. 

Mor worked to develop an algorithm that 
could automatically classify women as hav- 
ing cancer or not, depending on levels of these 
four proteins. When the team ran a new set of 
blood samples through the algorithm, they got 
astounding results. The test showed a sensitiv- 
ity of 95% (meaning it correctly detected 95% 
of the ovarian-cancer cases) and a specific- 
ity of 95% (it erroneously classified only 5% 
of healthy people as having cancer). “I was 
delighted,” says Mor. On equivalent samples, 
CA-125 tests typically have a sensitivity of 
70-80% and a specificity of around 95%. In 
May 2005, the findings were published in the 
Proceedings of the National Academy of Sciences
(PNAS), with Ward as a contributing author.
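
The two definitions given in parentheses above can be written out as a small calculation. This is only a sketch: the counts below are hypothetical round numbers chosen to give 95% for both measures, not the sample sizes of the PNAS study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of people with the disease whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of healthy people whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 100 women with ovarian cancer, 100 healthy controls.
sens = sensitivity(true_positives=95, false_negatives=5)
spec = specificity(true_negatives=95, false_positives=5)
```

Both measures are computed within the diseased and healthy groups separately, so neither one depends on how common the disease is in the population being tested.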

Before publication, Mor helped the Yale 
Office of Cooperative Research to prepare 
a patent application. “A lot of companies 
expressed interest in licensing the panel,”
says John Puziss, director of technology 
licensing at Yale. LabCorp licensed the test 
in 2006, as did Millipore, a biomanufac- 
turing company based in Billerica, Mas- 
sachusetts. (Mor says that the royalties he 
and his co-inventors received “were not a 
significant amount”.)

The test's promising results had also caught 
the attention of researchers in the EDRN, 
who were just putting together their valida- 
tion study. Up to that point, most biomarkers 
for detecting early ovarian cancer had only 
been shown to distinguish patients with diag- 
nosed cancer from healthy controls, but they 
are intended to detect the disease in women 
whose cancer is just budding, before symp- 
toms develop. What the field needed was a 
‘prospective’ study, run on blood samples from 
apparently healthy women, to see whether the 
biomarkers could pinpoint those who would 
later be diagnosed with ovarian cancer. Such 
samples, from large numbers of women who 
are tracked over months or years, are extremely 
difficult to come by. 


PROBLEM DETECTION 

The EDRN found what was needed in the 
Prostate, Lung, Colorectal, and Ovarian 
(PLCO) Cancer Screening Trial, sponsored 
and run by the NCI. Between 1992 and 2001, 
the trial had been collecting blood at regu- 
lar intervals from 155,000 women and men, 
and screening them for cancer. By June 2006, 
118 of the women had developed ovarian 
or closely related cancers, and the EDRN 
researchers were now in a position to use them
to evaluate the most promising biomarkers 
for early detection. Ziding Feng, a biostatisti- 
cian at the Fred Hutchinson Cancer Research 
Center (FHCRC) in Seattle, Washington, and 
coordinator of the EDRN, visited Mor to dis- 
cuss whether his panel of four proteins could 
be included in the study. 

Mor was already in the process of refin- 
ing the panel: he had more patient samples, 
and wanted to add more markers, including 
CA-125 and the protein macrophage migra- 
tion inhibitory factor, to make the test more 
sensitive to cancer. LabCorp had been run- 
ning his new samples on assay kits manufac- 
tured by Millipore. (Ward, meanwhile, had 
moved to the Nevada Cancer Institute in Las 
Vegas, and was not involved in data collection 
or analysis.) 

When Mor showed Feng how he was ana- 
lysing his recent data, Feng was troubled. Mor 
asked him to go through the new results him- 
self, and Feng agreed to collaborate. “I do not 
do statistics,” says Mor. “That is not my field.” 
The researchers also added the six-protein 
panel to the EDRN’s validation study. 
Feng and Gary Longton, another statistician
at the FHCRC, developed their own classifica- 
tion algorithms, and found that Mor's test had a 
sensitivity of 95% and specificity of 99%. They 
also calculated the positive predictive value 
(PPV) of the test — the proportion of patients 
flagged by the test as having the disease
who do in fact have it. A high PPV means that
few people will be misdiagnosed, which is cru- 
cial when screening healthy people. 

Feng and Longton calculated the PPV at 
6.5%, too low for the test to be of much use for 
screening. But separately, Mor was working with 
a different figure, of 99.3%. The huge disparity 
between the two values stemmed from the way 
that they calculated the figure and factored in 
the prevalence of ovarian cancer — an impor- 
tant variable in calculating the PPV. Following 
convention, Feng and Longton calculated the 
PPV using the accepted prevalence in postmenopausal
women, 1 in 2,500 (0.04%). But
Mor's figure was calculated solely from the study 
population, in which the prevalence was 46%. 
“We calculated the PPV based on the popula- 
tion in the study, because we always intended 
the test for the high-risk population,” says Mor. 
“If you want to bring the test to the clinic, it 
has to be calculated based on the population 
you’re going to study,” he says, noting that other
research studies work out the PPV for the study 
population in this way. 
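
The dependence of the PPV on prevalence follows from Bayes' rule, and a short sketch shows how large the effect is. The function name is hypothetical, and the inputs are the rounded sensitivity (95%) and specificity (99%) quoted above, so the outputs will not exactly reproduce the published 6.5% and 99.3%; the published figures presumably rest on unrounded values.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person with a positive result truly has the disease."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Rounded figures from the text; only the prevalence differs between the two calls.
ppv_screening = positive_predictive_value(0.95, 0.99, prevalence=1 / 2500)
ppv_study = positive_predictive_value(0.95, 0.99, prevalence=0.46)

print(f"PPV at 1-in-2,500 prevalence: {ppv_screening:.1%}")
print(f"PPV at 46% prevalence:        {ppv_study:.1%}")
```

With identical test performance, the first value comes out at a few per cent and the second at close to 99%. When the disease is rare, even a 1% false-positive rate among the many healthy women produces far more false alarms than there are true cases to find.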

It’s a common mistake, believes McShane,
who — like other statisticians — disagrees with 
Mor’s logic. “I see that a lot, but it is nowhere 
near the correct thing to do,” she says. Even 
in high-risk populations — women who are 
screened every year because of their family his- 
tory or because they have tested positive for 
mutations in tumour-suppressor genes BRCA1 
or BRCA2 — the prevalence is around 0.5%,
far below the 46% in the study population.

Debates over the correct use of statistics
litter the cancer-biomarker field, said
researchers at the IOM meeting last year. “It’s
the type of thing where non-statisticians think
statisticians are being uptight about something
that’s not going to matter anyway,” says McShane.

Mor prepared a paper reporting the latest 
work. But when Feng and Longton saw the 
page proofs, they noticed that the PPV value 
was reported as 99.3%. They asked Mor to 
change it to the 6.5% that they had calcu- 
lated, and to correct a few other typographical 
errors in the tables. “He agreed, so we signed
off,” recalls Feng. But there was a miscommunication:
Mor thought that Feng had agreed
to the use of the high PPV, and that everyone 
approved of the final manuscript. 

The paper was published online in Clinical
Cancer Research in February 2008, and to
Feng’s shock it reported the high PPV. “You can
imagine how upset I was when I saw it in the
paper,” says Feng.

Gil Mor is testing whether a panel of six proteins can detect ovarian cancer in women at high risk.

Feng called Mor. “I told him, those are
errors, we told you those are not correct.” Feng
also contacted the journal, the editor of which
asked Mor to submit a correction to fix the
PPV and the other typos. Mor agreed, adding
the lower PPV as a footnote to the table and in
a written correction.

A few weeks later, Feng received an e-mail 
with unwelcome news from a colleague: Lab- 
Corp was preparing to market the panel, and 
was “hopeful that this test will be available to 
women by the end of the year”. 

“I was shocked,” says Feng. “I had no idea
this was coming.” He thought that the mark- 
ers should be validated further before they 
went to the clinic. In March 2008, Feng and 
Mor saw each other at a meeting in Washing- 
ton DC. “I told him, face to face, you cannot 
do this,” says Feng. “You have to wait until 
after the PLCO validation. What you have 
done is early discovery. If validation does not 
support your earlier claim, you're making a 
significant error.” Mor does not recall this
encounter, but says that Feng’s “role was to
analyse the data, not to make judgements of
a company decision”.

Now, Mor says that if he were preparing the 
paper again, he would include both the low 
and high values for the PPV. And he vacillates 
about whether LabCorp’s decision to offer the 
test to women before it had undergone more 
validation studies was the right thing to do. 


He says he thought that clinical use of the test 
might be a good way to do further validation. 
“It’s very difficult to do that on large numbers 
of patients,” he says. “It’s extremely expensive. 
The only way to do the study is if LabCorp 
started distributing the test and enrolling 
patients.” Mor notes that many tests, such as
mammography, have been offered to patients
as an aid to diagnosis even while data on the
test are being collected. “Was it the right time?
I don’t know,” he says.


CRITICAL BACKLASH 

On 23 June 2008, LabCorp announced the 
availability of the OvaSure test, for between 
US$220 and $240. The press release said that 
it was being offered to women with a high risk 
of the disease, and quoted Mor as saying he 
was “pleased that this test is available to help 
physicians detect and treat ovarian cancer in 
its earliest stages”. 

Excited chatter about the test spread through 
patient forums and support groups, but it 
was soon countered by cautionary tales. Jean 
McKibben, an ovarian-cancer survivor, 
rushed to take OvaSure on the first day it was 
available, and her results showed a 0.00 chance 
of cancer. A week later, scans showed that her 
cancer was back. She was crushed. “I wanted 
this to work so badly,” she wrote on a discussion
board.

One week after LabCorp’s announce- 
ment, the Society of Gynecologic Oncolo- 
gists in Chicago, Illinois, released a statement 
expressing concern about OvaSure, saying 
that “additional research is needed to validate
the test’s effectiveness”. The paper in
Clinical Cancer Research was also circulat- 
ing at the Canary Foundation, a non-profit 
organization based in Palo Alto, California, 
that funds research on early cancer detec- 
tion. Scientists there found other reasons for 
concern. One member, Nicole Urban, head of 
the Gynecologic Cancer Research Program at 
the FHCRC, had found that levels of prolac- 
tin, one of the proteins in the panel, are highly 
sensitive to stress — something very likely to 
affect women entering the clinic with symptoms
of ovarian cancer. After controlling for
that, she says, “prolactin gave no signal at all 
for malignancy. It was useless.” Others pointed 
out that the high specificity and sensitivity fig- 
ures reported in the paper’s conclusions, and 
trumpeted in Yale and OvaSure press releases, 
were not present in any of the tables or fig- 
ures. And they bristled at the positive tone of 
the discussion, which stated that the test “will 
enhance the potential of treating ovarian can- 
cer in its early stages and therefore, increases 
the successful treatment of the disease”. 

“There were a lot of uncertainties, and 
evidence of biases,” says Martin McIntosh, 
who researches markers for early-stage ovar- 
ian cancer at the FHCRC, and is a member 
of the Canary group. “But the narrative only
highlighted the best-performing analysis. 
It didn’t mention caveats.” Members of the 
Canary group wrote a letter to Clinical Cancer 
Research, describing some of their complaints. 
Meanwhile, Feng agreed to co-author a second 
letter, criticizing the paper even though he was 
a co-author. 

The fuss was already reaching the FDA, 
which on 7 August 2008 sent a letter to Lab- 
Corp saying that the test “has not received 
adequate clinical validation, and may harm the 
public health”. A second letter, sent by the FDA 
on 29 September 2008, alleged that LabCorp 
did not have the necessary marketing clear- 
ance or approval for the test from the FDA. 
LabCorp replied to the FDA on 20 October, 
disagreeing with the agency’s assertions, but 
agreed to pull OvaSure from the market. It 
did so on 24 October 2008, just one day after 
Clinical Cancer Research published the critical 
letters from the Canary Foundation and Feng, 
as well as a third from the Centers for Disease 
Control and Prevention (CDC) in Atlanta, 
Georgia. (Millipore continues to market
the biomarker panel for use in research, not 
by patients.) 

Mor was surprised by all three letters. In 
his published response, he disputed some
of the criticisms and wrote that any concerns 
about commercialization should be taken 
up with LabCorp. Stephen Anderson, vice- 
president of investor relations at LabCorp, 
says that OvaSure was not marketed as a test 
for detecting cancer recurrence, which was 
how some patients used it. He says that LabCorp
“continues to believe OvaSure offers a
valuable tool for ovarian-cancer detection
in conjunction with other diagnostic techniques”,
and that the assay is still in development.
The company would not provide
further comment.


CASE STUDY

The gene collection that could

Genomic signatures measured in tissue samples can help to classify breast-cancer tumours.

When asked to name a successful cancer test that is based on multiple
genes or proteins, many researchers point to Oncotype DX. The panel,
which tests surgically removed breast tumours for the expression level
of 21 genes to predict the likelihood of the cancer recurring, was
developed by Genomic Health in Redwood City, California, and has been
marketed since January 2004. It is used by roughly half the patients in
the United States who have the most common type of breast cancer.

“A lot of biomarker research starts with getting some signature and
then figuring out how to use it,” says Richard Simon, a biostatistician
at the US National Cancer Institute in Rockville, Maryland. But Genomic
Health researchers took a different route, sitting down in 2000 with a
group of oncologists and patient advocates and asking what question in
cancer treatment they should address. Two years later, they nailed it
down.

At present, the majority of women with the most common type of early
breast cancer undergo chemotherapy after surgery, but only 15% are
likely to have a recurrence. Was it possible to identify the crucial
15%, and spare the others from chemotherapy? The team set out to find a
gene signature that could do the job. “The number one key to their
success was starting with a well-defined, clinically relevant
question,” says Simon.

The researchers brought in Michael Walker, a statistician then based in
Sunnyvale, California, to help design the studies from the outset.
Walker says that it rarely works this way, and that often the
statistician is only brought in after the data have been collected. By
that time, biases and confounding factors may be hard-wired into the
data. The team also put a high priority on using the right tissue
samples in their initial studies. They decided early on that they
wanted to use tumour tissue that had been fixed in formalin and
embedded in paraffin (the way it is prepared by the pathology lab after
a tumour is removed) in the clinic and in all clinical trials.

Because of this, the team was able to validate the panel using samples
that had already been collected in large clinical trials, rather than
having to collect samples afresh.

“It’s a poster child for one way to do clinical research,” says David
Ransohoff, a cancer epidemiologist at the University of North Carolina
in Chapel Hill.

But the test is hardly perfect. In January 2008, a group commissioned
by the Centers for Disease Control and Prevention in Atlanta, Georgia,
evaluated Oncotype DX. It found that the test results were reproducible
and did well at predicting recurrence, but it was unclear whether the
test was better than established risk factors, such as age, or standard
molecular features of the tumour. Results from a large, independent
validation study, called TAILORx, are expected in 2015. L.B.


DOUBTS AND LESSONS 

Since then, Mor has worked hard to validate 
his panel. He and Ward have completed a 
study on a much larger set of samples including
many from women diagnosed in the
earliest stages of ovarian cancer, and in
which LabCorp again ran the assays. The
test still performed well at distinguishing 
the patients from the healthy controls. Mor 
says he is puzzled by the PLCO trial results, 
and he hopes that further analysis of the trial 
data will help to explain why his biomarkers 
performed so poorly. He continues to express 
confidence in his panel, saying that the test 
could be most useful in high-risk populations, 
and when used regularly — every two to three 
months — to monitor rising and falling levels
of the biomarkers. But the whole experience 
has made him reluctant to pursue biomarker 
work much further. “I’m focusing on under- 
standing cancer stem cells,” he says. 

Others say that’s just as well. The panel's poor 
performance in the PLCO study makes critics 
question its usefulness in any group, even a 
high-risk one. McIntosh says that the PLCO 
study’s damning conclusions should serve as a 
wake-up call. “The entire field has to cope with 
this,” he says — including him, given that the 
most promising biomarkers discovered by his 
institution also failed to improve on CA-125 in 
the trial. “It’s hugely disappointing.”

The IOM committee, which is expected to 
release its results sometime in 2012, may help 
to find a way forward. At a meeting later this 
month, the members plan to draw lessons from 
the biomarker failures, as well as from the few 
success stories (see ‘The gene collection that
could’). One of the most urgent lessons is the
need to help researchers validate their bio- 
markers on appropriate samples before they 
reach the clinic. Feng says that the EDRN has 
been collecting its own high-quality tissue ref- 
erence sets for ovarian, breast, lung, colon, liver 
and prostate cancers, from people who aren’t
yet showing symptoms and those in all stages 
of the disease. Investigators can apply to test 
their biomarkers on blinded tissue samples. 

Until this type of testing becomes common- 
place, there is no way of excluding the possibil- 
ity that, as Hayes suggested at the IOM meeting, 
“these things just don’t work” — particularly 
when it comes to picking up cancer early on. 

“People keep talking about early-detection 
biomarkers as if they are a fact, and we only 
need to find them,” says McIntosh, “when in 
reality their existence is a hypothesis that needs 
to be tested.” SEE OUTLOOK P.450


Lizzie Buchen is a freelance writer in San 
Francisco, California. 
THE DARK.


The race to detect dark matter has yielded mostly confusion. But the larger, 
more sensitive detectors being built could change that picture soon. 


For a substance that is utterly invisible,
dark matter does a remarkably good 
job of making its presence felt. Astronomers have 
been compiling evidence for it since the 1930s, tracing 
how it shapes galaxies, galaxy clusters and even bigger 
cosmic structures by the inexorable force of its gravity. 
Although its real nature is unknown, dark matter seems to outweigh the 
ordinary matter visible in stars and galaxies by roughly 5.5 to 1. 

Down here on Earth, however, physicists struggling to answer the 
‘what is it?’ question often feel like they’re chasing a ghost. Certainly, 
their detectors have been giving them a lot of strange and contradictory 
results. Two experiments are independently seeing what seems to be a 
flux of dark matter streaming through their apparatus. Another detector 
may have seen a handful of dark-matter particles last year — although 
the experimenters dismiss them as background noise. And yet another 
experiment has found no evidence for dark matter at all. 

Fortunately, this confusion is likely to be temporary. Dark-matter 
detectors are roughly 1,000 times more sensitive to ultra-rare events 
than they were 20 years ago, and that should increase by another factor 
of 100 over the next decade, as physicists build bigger detectors and 
become more skilled at suppressing the background noise that can be
confused with genuine signals (see ‘Dark-matter detectors’). “It would
not be surprising if a year from now someone stood up and said we have
done it, we’ve detected dark matter,” says Sean Carroll, a theoretical
physicist at the California Institute of Technology in Pasadena. Other 
physicists give a more cautious estimate of five to ten years. Nonetheless, 
there is a palpable sense that the field is on the verge of something big. 

Most of the attempts to detect dark matter directly have started from 
the assumption that the stuff is a haze of weakly interacting, massive 
particles (WIMPs) left over from the Big Bang. The ‘massive’ part would 
explain the gravity. And the ‘weakly interact- 
ing’ part would explain the invisibility: the 
WIMPs would flow through stars, planets 
and people in untold numbers, almost never 
hitting anything. 

That assumption dictates the basic detection strategy: bring together
a large target mass of material; put it deep underground to shield it
from cosmic rays and other radiation that could produce
misleading signals; then measure the recoil energy when a dark-matter
particle finally hits an ordinary nucleus. The larger the mass of material,
the more likely it is that a dark-matter particle will hit something.

X-rays (pink) reveal ordinary matter in a galaxy cluster, and gravitational lensing of background galaxies (blue) maps dark matter.

BY ADAM MANN

Beyond those basics, setting up such an experiment requires a certain 
amount of guesswork. To have a significant recoil effect, for example, 
researchers need a target nucleus of roughly the same mass as the dark- 
matter particle they are seeking. It’s like watching for an invisible pool 
ball, says Jonathan Feng, a particle physicist at the University of Cali- 
fornia, Irvine. If the target nucleus is the equivalent of a bowling ball, 
the impact will barely move it. If, on the other hand, the target is the 
equivalent of a ping-pong ball, it will hardly be capable of deflecting the 
dark-matter particle, and so again there will be little energy transferred. 
What you want is another pool ball, Feng says. 
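
Feng’s pool-ball picture follows from the kinematics of an elastic collision: in a head-on hit, a projectile of mass m passes a fraction 4mM/(m + M)^2 of its kinetic energy to a target of mass M at rest, and that fraction reaches 1 only when the two masses are equal. A minimal sketch of that textbook formula follows. The WIMP mass of 100 GeV and the rounded atomic masses are assumptions made for illustration, not values taken from any of the experiments.

```python
GEV_PER_ATOMIC_MASS_UNIT = 0.9315  # approximate mass-energy of 1 u, in GeV

# Approximate atomic masses (in u) of some target nuclei; rounded values.
TARGET_MASSES_U = {"silicon": 28.1, "germanium": 72.6, "xenon": 131.3}


def max_energy_fraction(projectile_mass, target_mass):
    """Largest fraction of kinetic energy transferred in an elastic, head-on,
    non-relativistic collision with a target at rest (masses in the same units)."""
    return 4.0 * projectile_mass * target_mass / (projectile_mass + target_mass) ** 2


WIMP_MASS_GEV = 100.0  # hypothetical WIMP mass, chosen only for illustration

for name, mass_u in TARGET_MASSES_U.items():
    target_gev = mass_u * GEV_PER_ATOMIC_MASS_UNIT
    fraction = max_energy_fraction(WIMP_MASS_GEV, target_gev)
    print(f"{name:>9}: target of {target_gev:6.1f} GeV, "
          f"up to {fraction:.0%} of the energy transferred")
```

The function is symmetric in its two arguments, so a target that is far too heavy and one that is far too light lose out in the same way, which is the point of the bowling-ball and ping-pong-ball comparison.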


SUPERSYMMETRICAL WIMPS 

Several dark-matter experiments have placed their bets on super- 
symmetry: a theory in which each particle in the standard model of 
physics would have a heavier, and so far unobserved, partner. Supersymmetry
predicts the existence of a WIMP called a neutralino, which
would have exactly the right properties to account for the dark-mat- 
ter distribution seen in the Universe. Its interactions would be feeble 
enough, yet its mass would be substantial — fifty to a few thousand 
times the mass of a proton. 

One of the most highly regarded of the neutralino detection efforts 
is the XENON Dark Matter Search Experiment, located in the underground
portion of the Gran Sasso National Laboratory near L’Aquila,
Italy, and operated by a consortium of US and European universities. As 
its name suggests, the experiment’s detection medium is a tank of liquid 
xenon, which has a mass of just over 131 atomic mass units — close 
to ideal for detecting WIMPs at the lighter end of the supersymmetry 
range, which is by far the easiest place to start the search. 

Photomultiplier tubes lining the inside of the XENON tank look 
for the characteristic flash of light called scintillation that would be 
generated if a xenon atom had recoiled from the impact of a WIMP. 
The XENON collaboration’s first detector, built in 2006, used around
15 kilograms of xenon and found nothing that could not be attributed
to background radiation. The team then upgraded to a bigger, more
sensitive, 161-kilogram version in 2009, dubbed XENON100.


24 MARCH 2011 | VOL 471 | NATURE | 433

© 2011 Macmillan Publishers Limited. All rights reserved


DARK-MATTER DETECTORS

XENON100
LOCATION: Gran Sasso, Italy
TARGET: 161 kg liquid xenon
MASS RANGE: 10-100 GeV
START DATE: 2009
FINDINGS: No detections

START DATE: 2008

Although an initial 11-day data run on this detector still failed to find
any particles, that result was significant in itself: WIMPs with a mass of
less than 100 gigaelectronvolts (GeV) should have shown up, says Laura
Baudis, the physicist who leads the XENON group at the University of
Zurich, Switzerland. Because they didn’t, those lower masses could be
ruled out. Unfortunately, the results from a subsequent, 100-day run
remain obscure: the researchers are still struggling to deal with
unexpectedly high levels of background radiation caused by trace
contaminants in the xenon.


PURE AND SIMPLE 

An experiment searching a similar mass range is the Cryogenic Dark 
Matter Search (CDMS) in the disused Soudan mine in northern 
Minnesota. As a detection medium, the CDMS team uses a collection 
of germanium and silicon crystals, which are among the only solid ele- 
ments that can be made with high enough purity to be usable for detect- 
ing dark matter. When the detector is operating, these crystals — which 
are about 10 centimetres across — are cooled to a temperature of just 40 
millikelvin, so any heat associated with a WIMP impact can be detected. 

Now running its second-generation experiment, called CDMSII, the
collaboration generated some excitement early last year when it reported
two detections that could be interpreted as dark-matter signals. Despite
the hubbub, the team is reserved. “We don’t claim this is significant; we
see a lot of events at this low threshold and most of them are plausibly
background,” says Jeffrey Filippini, who works on the CDMS team from
the California Institute of Technology. If those two events are discounted, 
the CDMS team gets much the same result as the XENON collaboration: 
null findings that effectively rule out low-mass WIMPs. 

Yet the XENON and CDMS results contradict those from other 
experiments, the operators of which claim to have detected the very 
low-mass WIMPs ruled out by the first two. Perhaps the most intrigu- 
ing, and most controversial, of these experiments is the Dark Matter 
Large Sodium Iodide Bulk for Rare Processes (DAMA/LIBRA), which 
shares space with XENON at Gran Sasso. DAMA works on the principle
that the Sun’s orbit around the centre of the Galaxy carries the
Solar System through the invisible cosmic background of dark matter
at some 220 kilometres per second. So detectors on Earth should have
dark matter flowing through them at that velocity, modulated by an
annual variation of 30 kilometres per second as the planet orbits the Sun.


Today’s dark-matter detectors are 1,000 times more sensitive than their predecessors.
The detectors coming on line will have bigger targets, making them even more sensitive.

DAMA/LIBRA
LOCATION: Gran Sasso, Italy
TARGET: 250 kg sodium iodide crystal
MASS RANGE: Light dark matter
FINDINGS: Annual oscillations

CDMSII
LOCATION: Soudan, Minnesota
TARGET: 230 g germanium, 105 g silicon crystals
MASS RANGE: 10-100 GeV
START DATE: 2004
FINDINGS: Two detections. Background noise?


The DAMA team, which looks for the scintillation of recoil events 
inside sodium iodide crystals, claims to have followed just such a periodic
dark-matter signal for thirteen years. However, the crystals cannot
distinguish between WIMPs and background events from ordinary
radiation in the detector’s surroundings, so this result depends on the 
assumption that background events occur at a constant rate that does 
not vary with the season. If that result is valid, it flies in the face of the 
XENON and CDMS findings. 
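
The signal DAMA is looking for can be sketched as a cosine with a one-year period riding on a constant speed. The function below uses the figures of 220 and 30 kilometres per second given above. The peak date (around 2 June, when Earth’s orbital motion is most closely aligned with the Sun’s) and the optional projection factor are assumptions drawn from the standard treatment of annual modulation, not from this article.

```python
import math

SUN_SPEED_KM_S = 220.0      # Sun's speed through the dark-matter background (from the text)
EARTH_ORBIT_KM_S = 30.0     # Earth's orbital speed around the Sun (from the text)
PEAK_DAY_OF_YEAR = 152.5    # assumed peak, about 2 June
DAYS_PER_YEAR = 365.25


def dark_matter_wind_speed(day_of_year, projection=1.0):
    """Approximate speed (km/s) of the dark-matter 'wind' as seen from Earth.

    projection scales the orbital speed to allow for the tilt of Earth's
    orbit relative to the Sun's motion: 1.0 reproduces the simplified
    picture in the text, and about 0.5 is often used for a 60-degree tilt.
    """
    phase = 2.0 * math.pi * (day_of_year - PEAK_DAY_OF_YEAR) / DAYS_PER_YEAR
    return SUN_SPEED_KM_S + projection * EARTH_ORBIT_KM_S * math.cos(phase)


# Sample the curve at the assumed peak, a quarter of a year later and half a year later.
for months_after_peak in (0, 3, 6):
    day = PEAK_DAY_OF_YEAR + months_after_peak * DAYS_PER_YEAR / 12.0
    print(f"{months_after_peak} months after peak: "
          f"{dark_matter_wind_speed(day):.1f} km/s")
```

Any background that also happened to rise and fall once a year would trace out a curve of the same shape, which is why the constant-background assumption carries so much weight.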

“If the main signal was as big as they claim, we and other teams 
would have seen it,” says Leo Stodolsky of the Max Planck Institute for 
Physics in Munich, Germany, who works on a collaboration called the 
Cryogenic Rare Event Search with Superconducting Thermometers 
(CRESST), also at Gran Sasso. Voicing a scepticism shared by many 
non-DAMA physicists, Stodolsky says that any number of seasonal 
processes could release subatomic particles that would mimic DAMA’s 
results for dark matter, including something as simple as snow melting 
and refreezing in the mountains above the lab. 

Further eroding DAMA’s position is that no other dark-matter
experiment is looking for a periodic signal, so its results cannot be
directly replicated. Yet, despite the criticisms, the DAMA signal gets
stronger each year. “DAMA has been very courageous,” says Juan Collar,
a physicist at the University of Chicago in Illinois. “They went out and
made a claim” when most other physicists were still inclined to dismiss
their results as background noise.

Collar leads an effort called Coherent Germanium Neutrino Technology
(CoGeNT), whose detector sits near the CDMSII in the Soudan
mine. CoGeNT uses germanium crystals tuned to detect incoming par- 
ticles with much lower masses than those sought by its neighbour. It 
was originally intended to explore this range to rule out the existence of 
low-mass WIMPs, but its results only ended up making things murkier. 

Around the time that the CDMSII reported its ‘nearly nothing’ findings,
CoGeNT released data from its first 56 days of operations. The
results showed hundreds of particle events that could be interpreted as 
dark matter with a mass between 7 and 11 GeV. 

These could also be the same particles that DAMA is detecting, but
physicists have been quick to offer a more sober reading. “For CoGeNT,
the signal and the background could easily be mistaken for the same
thing,” says David Kaplan, a physicist at Johns Hopkins University in
Baltimore, Maryland. The team has decided to wait a full year after its
initial publication, to see if its findings show the same seasonal fluctuation
as DAMA, before announcing any new results.


CoGeNT
LOCATION: Soudan, Minnesota
TARGET: 500 g germanium crystal
MASS RANGE: <10 GeV
START DATE: 2004
FINDINGS: Hundreds of detections

START DATE: 2011


TOTAL ANNIHILATION 

Meanwhile, a debate has broken out over another way to detect dark
matter. One of the many oddities of dark-matter particles is that they
can be their own antiparticles: put enough of them in one place, and
they should start annihilating one another, producing γ-rays in the
process. In particular, the centre of the Milky Way should be producing
excess γ-radiation because dark matter is expected to concentrate
there, says Dan Hooper, an astronomer at the Fermi National Accelerator
Laboratory located near Batavia, Illinois. And Hooper claims to have
found evidence for these γ-ray excesses in data from NASA’s Fermi
Gamma-ray Space Telescope.

“If you were to ask what kind of a signal you would want to see with
dark matter in the galactic centre, this would be what you expect,” says
Neal Weiner, a theoretical physicist at New York University.
The results are consistent with a dark-matter particle of 7.3-9.3 GeV, a
range that fits well with the findings from both CoGeNT and DAMA.

Other researchers remain sceptical. “The galactic centre is so 
complicated that before you believe that you have dark-matter anni- 
hilation, you have to rule out all the options,” says Doug Finkbeiner, 
an astronomer at the Harvard-Smithsonian Center for Astrophysics in 
Cambridge, Massachusetts. Finkbeiner points out that the signal could 
come from undetected pulsars — rapidly rotating neutron stars that 
produce copious amounts of high-energy radiation. 

Still, Hooper's results have given researchers food for thought. “It’s a 
case of too many coincidences,” says Collar. When findings from three 
detectors all start to point towards dark-matter particles of similar mass, 
he says, “you start to wonder if they're not coincidences any more”. 

Such thinking has led theorists such as Feng to take a fresh look at all
the results to see whether they can come up with a coherent idea of what
dark matter might be. If CoGeNT and DAMA are right, says Feng, then
they are not detecting the expected dark-matter particle, the neutralino,
as it should not be as light and interact as strongly as the results indicate.
So perhaps dark matter is some very different particle, or perhaps the
model of a single WIMP is not correct.


“BEFORE YOU BELIEVE THAT YOU HAVE DARK-MATTER ANNIHILATION, YOU HAVE TO RULE OUT ALL THE OPTIONS.”


LUX
LOCATION: Homestake, South Dakota
TARGET: 350 kg liquid xenon
MASS RANGE: 10-100 GeV
FINDINGS: Not yet started

XMASS
LOCATION: Kamioka, Japan
TARGET: 1,000 kg liquid xenon
MASS RANGE: 10-100 GeV
START DATE: 2011
FINDINGS: Not yet started

“If you look at the few per cent of the Universe that comprises us, it’s 
quite complex,” says Philip Schuster, a physicist at the Perimeter Institute 
for Theoretical Physics in Waterloo, Canada, referring to the known 
‘particle zoo’ predicted by the standard model that includes such oddi- 
ties as muons, neutrinos and quarks. “It’s a little insane to believe that 
the other 85% of the Universe would be so simple,” he says.

Along with his collaborators, Schuster is working to find evidence for 
a more complex theory of dark matter, called the ‘dark sector’. This sector
could include multiple types of dark matter and a number of dark forces, 
which, like ordinary matter, could combine 
to form dark atoms. It is being tested in an 
experiment called the A Prime Experiment 
(APEX), at the Thomas Jefferson National 
Accelerator Facility in Newport News, Vir- 
ginia, which will accelerate a high-energy 
beam of electrons and search for relatively 
heavy force-carrying particles radiating 
from them. “It might tell us that the Universe
is a lot broader than we suspect,” says Natalia Toro, a physicist who
works at the Perimeter Institute and with Schuster on APEX. 

The good news is that both the XENON100 and CoGeNT collabora- 
tions are expected to release their first full year’s worth of data this year. 
And larger, more sensitive detectors, such as the Large Underground 
Xenon (LUX) and Xenon neutrino Mass (XMASS) detectors, are sched- 
uled to start operations not long afterwards. “While we are in a ‘he said, 
she said’ situation now, it won’t be like that indefinitely,” says Weiner. “We
will have enough information to settle this in the next couple of years.”


Adam Mann is a freelance writer in Washington DC. 
This 1961 effort to drill right through Earth’s crust didn’t succeed. Half a century on, geologists are ready to try again.


Journey to the 
mantle of the Earth 


On the 50th anniversary of the first attempt to drill into Earth’s mantle, Damon
Teagle and Benoit Ildefonse say that what was once science fiction is now possible. 


Retrieving a sample of Earth’s mantle
has been an overarching ambition
of the geoscience community for
more than a century. In 1909, the Croatian
meteorologist Andrija Mohorovičić noticed
that seismic waves travelling below about
30 kilometres underground move faster
than those above that depth, indicating a
fundamental change in the composition
and physical properties of the rocks. He had
discovered the upper boundary of Earth’s
mantle, now known as the Mohorovičić
discontinuity, or ‘Moho’ for short. This
boundary marks the start of the bulk of
Earth’s interior, which extends from the base
of Earth’s crust (at 30-60 kilometres under
the continents but just 6 kilometres under
the thinner crust of the oceans) to the core
2,890 kilometres below.

Drilling down and retrieving samples
directly from the mantle would provide
scientists with a treasure trove comparable to
the Apollo lunar rocks, giving insight into the
origins and evolution of our planet. But this
has proved as difficult as, perhaps more difficult
than, going to the Moon. So far, no one has
drilled deeper than about 2 kilometres into
the ocean crust, or a third of the way to the
mantle. The first effort to drill into the mantle,
‘Project Mohole’, foundered in a geopolitical
quagmire and did not achieve its goal.

A new Mohole campaign is now under
way, thanks to improved technology,
a better understanding of the rocks far
below our feet and a deeper appreciation of
the challenges of drilling through them. Over
the next few years, geophysical surveys will
be conducted at three Pacific Ocean locations
that are in contention to be the site of
the first deep hole to the mantle (see ‘Drilling
sites’). Drilling down to the mantle will
require a huge amount of ship time and will
be very expensive: far more expensive than
a current single drilling expedition, although
far cheaper than a Moonshot. But if funding
can be found, and the scientific commitment
maintained, drilling could begin within the
decade, and be completed within 15 years. In
the meantime, next month, we will be leading
an expedition to the Pacific to bore further
into the oceanic crust than ever before.

Harry Hess, a founding father of the theory of plate tectonics, explains Project Mohole.


INSPIRED IDEA 
The first serious plans to drill down to the 
mantle were concocted in the late 1950s by 
a handful of post-war American geoscience 
grandees under the guise of the American 
Miscellaneous Society — an informal group 
of US National Academy of Sciences members,
sometimes referred to as a ‘drinking
club’. The idea came primarily from Harry
Hess, one of the founding fathers of the 
theory of plate tectonics, and Walter Munk, 
who pioneered studies of how winds drive 
ocean currents and explained why one 
side of the Moon is locked towards Earth. 
Frustrated by what they saw as a stream of 
worthy yet pedestrian research propos- 
als in their field, they sought to undertake 
something more ambitious and innovative. 
At a wine breakfast at Munk’s home in 
La Jolla, California, on a Saturday morning 
in April 1957, they came up with Project 
Mohole, a scheme to drill, for the first time, 
right through Earth’s crust and into the 
upper mantle. 




Back then the nascent offshore petroleum 
industry had not yet begun to contemplate 
deep-water drilling. The Mohole project 
required the development of new technolo- 
gies such as dynamic positioning, which 
would allow a drill ship to keep its position 
steady. The group obtained funding from the 
US National Science Foundation and com- 
missioned the best ship available for the job: 
the drilling barge CUSS 1, named after the oil 
companies that had developed it, Continen- 
tal, Union, Shell and Superior. Within four 
years of the project’s proposal, propellers had 
been installed on the side of CUSS 1 and a
system developed that allowed these to keep 
the ship in position. 

Between March and April 1961 scientists 
took their first core from the uppermost 
hard rock of the oceanic crust, or ‘basement’, 
off Guadalupe Island in the eastern Pacific 
Ocean, thanks to the daring and innovative 
engineering efforts of Willard Bascom and 
his colleagues. From beneath 3,800 metres 
of water and 170 metres of sediment, they 
pulled up a few metres of basalt, at a cost of 
US$1.5 million (about $40 million in 2009 
dollars, in terms of its share of the total US 
economy). This remarkable accomplish- 
ment was reported in Life magazine (14 April 
1961) by the novelist and amateur ocean- 
ographer John Steinbeck, who was aboard 
CUSS 1 during these first operations. 

This was the only ocean core that Project 
Mohole succeeded in drilling. After the 
expedition, the management of the pro- 
ject changed, some poor decisions were 
made about which drilling technologies to 
pursue, and costs spiralled out of control. 
In 1966, Project Mohole collapsed when the 
US Congress voted to cancel its funding. 

Nevertheless, the project coincided with a 
growing acceptance of plate-tectonics theory. 
Interest in the formation and evolution of the 
oceanic crust was booming. Project Mohole 
proved that scientific drilling into the ocean 
basement was possible. This contributed to 
the establishment, and continuance over 
four decades and running, of international 
collaboration in scientific ocean drilling. The 
Integrated Ocean Drilling Program (IODP; 
www.iodp.org) and its predecessors, the Deep 
Sea Drilling Project (DSDP) and the Ocean 
Drilling Program (ODP), form arguably the 
most successful, long-term international 
scientific collaboration in any field. 


THE DEEP FRONTIER 

The mantle holds some 68% of the planet 
by mass. Its sheer volume makes an accurate 
knowledge of its composition and variability 
essential for understanding how Earth was 
formed and has evolved. Almost all of Earth’s 
surface crust — the material that makes up 
the sea floor and the continents — originally 
came from the mantle. 

Some pieces of the mantle have been 
thrust up to Earth’s surface during tectonic 
mountain building, where they are avail- 
able for study. Other mantle pieces, encased 
in lava, have been ejected from volcanoes, 
and sea-floor spreading has brought some 
to the ocean floor. These pieces show that 
the mantle is composed mainly of rocks 
called peridotites, made of magnesium- 
rich, silicon-poor minerals such as olivine 
and pyroxene. They also suggest, together 
with far-field seismic measurements, that 
the mantle’s composition varies from place 
to place, but the extent of this variation 
remains unclear. The available samples have 
all been chemically altered by the processes 
that brought them to the surface or by 
exposure to sea water. Concentrations of 
many of the key elements and isotopic 
tracers (including water, uranium, thorium, 
lithium, carbon, sulphur, silicon, potassium, 
the noble gases 
and the iron oxidation state) that might be 
useful in reconstructing Earth’s evolution are 
highly labile. A few kilograms of fresh peri- 
dotite from beneath the crust would provide 
a wealth of new information. 

Getting to the mantle requires drilling 
through a full section of oceanic crust, which 
will also be a boon for geologists. The forma- 
tion of crust is the foundation of the plate- 
tectonic cycle. It is the main mechanism by 
which heat and material are dredged up from 
the interior of Earth, resurfacing some 60% of 
our planet every 200 million years. Currently, 
the thermal, chemical, and perhaps biological, 
exchanges occurring deep in the oceanic crust 
remain poorly understood because of the lack 
of direct observations in situ. 




SOURCE: REF. 1 


The technology to drill a hole a few inches 
wide through 6 kilometres of crust now 
exists or is feasible to develop. A promising 
candidate technology is on the Japanese drill- 
ing ship Chikyu, launched in 2002. The vessel 
has a riser system: an outer pipe surrounds 
the drill string — the steel pipe through 
which cores are recovered. The drilling mud 
and cuttings are returned up to the vessel in 
the space between the two pipes. This helps to 
recycle the drilling mud, control its physical 
properties and the pressure within the drill 
hole and helps to stabilize the borehole walls. 
It also means that cuttings can be evaluated 
for scientific purposes. Chikyu is a giant ship, 
capable of carrying 10 kilometres of drilling 
pipes, and is equipped for riser drilling in 
2.5 kilometres of water. 

Over the next decade, researchers and 
engineers will have to design and develop 
new drill bits, lubricants and wireline instru- 
ments to make coring into the mantle pos- 
sible, at pressures as high as 2 kilobars and 
temperatures up to 300°C, and beneath 
about 4 kilometres of water. In particular, 
this will require a riser system capable of 
going deeper than Chikyu’s current equip- 
ment, or a different mud-circulation system, 
with part of the equipment installed on the 
sea floor. 

To find the best site for the new Project 
Mohole, a large number of factors need to 
be considered. Ideally, it would be in the 
shallowest possible water overlying oceanic 
crust, which means going as close as possible 
to the mid-ocean ridge where new crust is 
formed. It should also be in the coldest pos- 
sible crust, which is away from the ridge. 
These constraints limit the possible sites 
to three — off the coasts of Hawaii, Baja 
California and Costa Rica, respectively. All 
have pros and cons. The site near Hawaii, for 
example, is the coolest, but also the deepest, 
and close to recent volcanic activity that 
might have chemically altered the mantle 
and perturbed the overlying crust. All the 
sites are in the Pacific Ocean because the 
crust there is formed faster than in other 
oceans, which makes for the simplest, most 
uniform basement architecture. Seismic and 
geological studies hint that fast-spreading 
ocean crust should be relatively uniform and 
conform most closely to textbook models. 
We hope to find material that conforms to 
these models of a simple crust: a layer cake 
of rock types called lavas, dikes and gabbros. 

Meanwhile, the ocean-drilling commu- 
nity continues to core as deeply into the 
crust as is feasible using the conventional, 
non-riser technology available on the drill- 
ing vessel JOIDES Resolution — the recently 
rebuilt workhorse of the IODP. 


THE LOWER CRUST 

Fifty years after John Steinbeck sailed on 
CUSS 1 with the pioneers of ocean-crust 
drilling, we are the co-chief scientists on 
an expedition to obtain for the first time 
a section of the lower oceanic crust — the 
material lying just above the mantle. IODP 
Expedition 335 is due to sail from 13 April 
until 3 June. 

The site chosen for this mission is on the 
Cocos plate off the coast of Costa Rica (ODP 
site 1256). This site is in ocean crust that 
formed superfast — at more than 20 centi- 
metres a year, much faster than any present- 
day crust formation. That makes the upper 
crust there much thinner than elsewhere, 
so it is possible to reach the lower portions 
without having to drill very deep. Three pre- 
vious expeditions to Hole 1256D have drilled 
down to more than 1.5 kilometres below the 
sea floor, into the transition zone between 
dikes and gabbros. 

This spring we plan to deepen the hole 
at least another 400 metres, and recover for 
the first time gabbros from the lower crust, 
which will be the deepest types of rock ever 
extracted from beneath the sea floor. (The 
hole itself is not the deepest ever drilled into 
the sea floor; that was Hole 504B, which 
reached 2,111 metres below the bottom of 
the eastern Pacific off Colombia.) 

DRILLING SITES 
Three areas are under consideration for drilling into the mantle. One includes the original Project Mohole 
drilling site. Another includes a site (ODP site 1256) where scientists will drill this year into the lower crust. 

The mission should help to settle many 
debates: how crust is formed at mid-ocean 
ridges; how magma from the mantle is 
intruded into the lower crust; the geometry 
and vigour of how sea water can pull heat 
from the lower oceanic crust; and the contri- 
bution of the lower crust to marine magnetic 
anomalies. We will still be 3.5 kilometres 
shy of the Moho, but the project will pro- 
vide further impetus for, and confidence in, 
deep ocean crust drilling, as well as crucial 
information for planning a full penetration 
through the Moho and into the upper mantle. 

Drilling to the mantle is the most 
challenging endeavour in the history of 
Earth science. It will provide a legacy of 
fundamental scientific knowledge, and 
inspiration and training for the next genera- 
tion of geoscientists, engineers and technol- 
ogists. There is a surprising level of interest 
from the world’s media and engagement by 
the general public in this frontier endeavour. 
And it may be only the beginning of a bigger 
project. As the crust and the mantle are both 
likely to be different from place to place, we 
will ultimately want a number of such holes. 
That may seem a distant dream, but ultra- 
deep drilling will only get more routine and 
cheaper with time and experience. 

As Harry Hess told a US National Academy 
of Sciences meeting in April 1958, when 
defending the first Mohole Project against 
detractors: “Perhaps it is true that we won’t 
find out as much about the Earth’s interior 
from one hole as we hope. To those who raise 
that objection I say, if there is not a first hole, 
there cannot be a second or a tenth or a hundredth 
hole. We must make a beginning.” 
Further reading accompanies this article online at 
go.nature.com/b2ehus. 
Ethiopian meteorologist Tufa Dinku working with Kenyan epidemiologist Judy Omumbo at Columbia University, New York. 


Africa needs climate 
data to fight disease 


Madeleine C. Thomson and colleagues call on climate and health researchers, 
policy-makers and practitioners to work together to tackle infectious diseases. 


Climate variability and change are a 
major concern for public health in 
Africa. The livelihoods of hundreds 
of millions of people there are dependent 
on rain-fed agriculture and seasonal water 
resources. Poor rural communities also 
suffer from undernutrition and bear the 
greatest burden of infectious diseases and 
natural disasters while having the least access 
to public-health services. Many of Africa's 
most important cities are on the coast and 
at risk of sea level rise. Without adequate 
infrastructure they are vulnerable to poor 
sanitation during floods and shortages of 
drinking water and loss of hydroelectric 
power during droughts. Rising tempera- 
tures, air pollutants and dust threaten to 
increase heat stress and respiratory disease. 

The Fourth Assessment Report of the 
Intergovernmental Panel on Climate Change 
predicts increased rainfall in eastern Africa 
over the coming century. Yet there has 
been a region-wide drought over the past 
ten years. Policy-makers want to know 
whether to prepare, short-term, for floods 
or droughts. They also need to know if the 
recent drying has aided malaria-control 
interventions in the region. But answering 
such questions is tricky. 

Climate information is not readily 
available, so is rarely incorporated into 
development decisions. At the same time, 
few public-health institutions or practition- 
ers are equipped to understand or manage 
the effects of a changing climate, despite 
major advances in recent years in alerting 
the health community to its risks. 

A dramatic improvement is needed in 
the availability of relevant and reliable 
climate data and services, particularly in 
Africa, where vulnerability to climate is 
so high. Information — such as historical 
observations of temperature, ten-day satel- 
lite estimates of rainfall, the predicted start 
date of the rainy season or the likelihood of 
extreme temperatures in the coming season 
— should inform the management of all 
diseases sensitive to climate. These include: 
malaria, leishmaniasis, acute respiratory 
infections, intestinal helminths and diar- 
rhoeal diseases. This information could also 
contribute to food security by providing, for 
example, early warning for agricultural and 
livestock pests and diseases. 

A.S. ORLING 

SOURCES: REF. 7; G. D. SHANKS ET AL. GO.NATURE.COM/CPN7KD 

The following must be put in place within 
the next decade: new partnerships between 
the public-health community and national 
meteorological agencies, space agencies 
and researchers; a governance structure 
that ensures data sharing between pub- 
lic and private agencies; a funding model 
that builds open-access climate databases; 
climate scientists focused on the delivery 
of quality products, tailored to user needs; 
health professionals trained to demand and 
use climate information; and evidence of the 
value of all this, relative to alternative invest- 
ments in health. 


TRANSFORMATIONAL POWER 
Good climate information, if freely avail- 
able, could transform the way in which 
the health community does business. For 
example, it could improve health calendars 
for seasonal diseases. It could lead to better 
timing of the distribution of bed nets, local 
public awareness campaigns, and drugs with 
a short shelf life. Health professionals could 
be better prepared for the diseases that fol- 
low floods and storms, such as leptospiro- 
sis and cholera. It would also enable better 
mapping of regions and populations vulner- 
able to emerging health problems such as 
meningococcal meningitis epidemics, which 
favour the hot, dry and dusty Sahel, a region 
that may be expanding owing to climate 
and environmental change. On longer time 
scales, researchers could probe the drivers 
and potential recurrence of major climatic 
events, such as the devastating Ethiopian 
drought of 1984-85 (immortalized in the 
Western psyche by Bob Geldof’s Band Aid). 
On 4-6 April, in Addis Ababa, profession- 
als — including policy-makers, practitioners, 
researchers, donors and the media — inter- 
ested in using climate science to inform 
public-health decisions will meet at the 
Climate and Health in Africa: 10 Years On 
conference. Participants must reflect on 
the slow progress since the groundbreaking 
Climate Prediction and Disease/Health in 
Africa workshop in Bamako, Mali, in 1999. 
There, climate scientists committed to work- 
ing directly with health practitioners and 
researchers. Participants must now agree a 
road map for concrete action over the next 
decade. 

The long-running debate on the likely 
effect of a warming climate on malaria in 
East Africa illustrates the problem. Over the 
past decade, numerous studies, with contra- 
dictory results, have attempted to explain the 
observed rise in malaria incidence in the 
western Kenyan highlands. For instance, 
peak monthly cases of malaria increased 
eightfold from the 1970s to the 2000s at the 
hospital that serves the Brooke Bond Farms 
(now Unilever Tea Kenya), tea estates near 
Kericho (with some tailing off in recent 
years). 

Researchers who found no evidence 
of significant warming at this site dur- 
ing the same period concluded that only 
non-climatic factors (for example drug 
resistance, and local environmental change) 
could be driving the increased malaria incidence. 
Those who found evidence of warming 
proposed that climate must play a part. 
These conflicting results fuelled a heated 
and polarizing debate in the malaria and 
climate-change literature (including this 
journal) and the media. The spat paralysed 
the discussion of climate and health at the 
highest policy level. 

The limited access to daily observations 
of surface air temperature from meteorological 
stations, in quality-controlled, long 
time series, has constrained studies. They 
have relied heavily on short time series, used 
data of poor quality or ignored local ground 
observations in favour of spatially interpo- 
lated global data sets that could not provide 
meaningful results at local scale. 

The value of the data held by national 
meteorological and hydrological ser- 
vices was made evident through a recent 
analysis of 30 years (1979-2009) of daily 
temperature and rainfall data from the 
Kericho meteorological station managed 
by the Kenya Meteorological Department; 
the data conform to World Meteorological 
Organization standards. The study (the 
authors of which include S.J.C. and M.C.T.) 
establishes that minimum and maximum 
temperatures, at this much-studied site, 
have been rising by about 0.2 °C a decade 
over the past 30 years. Furthermore, 
the study demonstrated the association 
between local temperatures in Kericho and 
sea surface and land surface temperature 
in the tropics, indicating the close relationship 
of the local Kericho climate to the 
larger climate system (see ‘Going up’). 

GOING UP 
Malaria incidence and temperatures have risen near Kericho in Kenya over the past 30 years; 
health experts are keen to know whether they are linked. 

Only now can researchers start to look 
properly for direct links between malaria 
incidence and climate variability and change 
in the area. We cannot yet say that warming 
helped the observed increase in malaria — 
although there is good reason to believe that 
it might. But we can say that climate should 
not be dismissed as a potential driver in the 
area. The key concern is that, if there is a link, 
increased warming in the East African high- 
lands may expose a largely naive population 
to malaria with devastating consequences. 

That it took a decade to establish a robust 
analysis of climate trends in Kericho, a focus 
of so much controversy, points to a broader 
disconnect between those who need climate 
information and those who produce it. In 
the 1980s, African meteorological agen- 
cies were encouraged to sell their data to 
raise revenue to maintain their networks of 
meteorological stations. The agencies’ ser- 
vices have understandably prioritized their 
primary client, the airline industry. Access 
for non-commercial purposes, including 
for malaria research, has been constrained 
by poor collaboration and high data fees, 
among other factors. 

Instead, funding models are needed that 
recognize climate data as a resource for 
development — a classic public good that 
increases in value the more times the data 
are used. 


ETHIOPIAN PROMISE 

The potential benefits of getting it right are 
considerable. Ethiopia, a country particu- 
larly vulnerable to the vagaries of climate, 
provides a promising example of what might 
be achieved. A new climate database there 
will draw on more than 600 national mete- 
orological stations merged with 30 years of 
10-day satellite data collected for rainfall 
monitoring. Generally fewer than 20 stations 
are available internationally via the Global 
Telecommunications System — the source 
of ground observations used in most avail- 
able rainfall monitoring products. Ethiopia's 
National Meteorological Agency is getting 
technical support from the International 
Research Institute for Climate and Society 
(the IRI — where M.C.T., S.J.C. and S.E.Z. 
work) and the Tropical Applications of 
Meteorology using Satellite data (TAMSAT) 
research group at the University of Reading, 
UK. The effort is funded by Google.org, the 
philanthropic arm of Google, and lent 
technical support by the IRI through a 
cooperative agreement between Columbia 
University in New York (where the IRI is 
located) and the US National Oceanic and 
Atmospheric Administration. 

This high-quality database will be used to 
create free-access climate reports tailored to 
the needs of the Ethiopian health community 
and other development sectors, such 
as agriculture and water resources. It will 
also be used to improve the assessment of 
climate-sensitive interventions such as the 
indoor spraying programmes supported 
through the Roll Back Malaria initiative. One 
would expect such measures to work when 
conditions are least favourable to malaria 
transmission, for example, during a drought. 
The database will also help the development 
of local, seasonal climate forecasts — of 
unusually wet or dry conditions, say. 

The Ethiopian climate database, the 
first of its kind, provides an opportunity to 
establish the value of climate information 
to improving health. Now that the system 
has been developed, the process can be more 
readily repeated in other countries. Doing 
so will build capacity where it is needed 
most — in the national meteorological 
agencies, regional climate centres and local 
universities. 


BETTER TOGETHER 

Health professionals need skills to under- 
stand and interpret climate data, and to 
request new types of information or ser- 
vices. They also need to develop mecha- 
nisms to incorporate this information into 
current epidemiological approaches in a 
cost-effective manner. 

One way forward is to target professional 
training and research in schools of public 
health. For instance, health-surveillance 
communities routinely monitor and 
prevent outbreaks and epidemics, through 
the analysis of current and historic epide- 
miological data. Where such events are 
climate-sensitive (for example Rift Val- 
ley fever epidemics) seasonal forecasts, 
meteorological information and satellite 
data could help map, monitor or anticipate 
changes in risk. 

The Climate Information for Public 
Health Action Network, led by the IRI, and 
its associated training are steps in the right 
direction. The curriculum enables climate 
and health experts to work together on 
common data sets and analyses, focusing 
the results on the needs of decision-makers. 
As a result, the African Field Epidemiology 
Network is exploring how climate informa- 
tion might be used in training for outbreak 
investigation. 

The Climate for Development in Africa 
project was launched in October 2010. 
This is a joint initiative of the African 
Union Commission, the United Nations 
Economic Commission for Africa and the 
African Development Bank. The project has 
a start-up fund of US$136 million and a clear 
mandate from African heads of state to help 
fill key gaps in policy, practice, services and 
data across the continent. It is a daunting but 
necessary task. To achieve its development 
targets, the initiative will need to respond 
directly to the needs of climate-sensitive 
sectors, including health. 

The global health community has worked 
for decades to get the resources necessary for 
effective control of diseases that affect poor 
people globally, especially malaria. Some 
people understandably fear that hard-won 
gains in political and financial support may 
be diluted, or worse derailed, by the climate- 
change agenda — especially in such aid-slash- 
ing times. But ‘turf’ anxieties are no reason 
for poor science. True interdisciplinarity 
requires more than fair-weather friends. 

Climate is most important as a driver of 
infectious disease where and when control 
efforts are weak and societies are poor. Climate 
information can help to put resources where 
they are needed most. It is an essential 
additional layer of data for disease prevention, 
control and elimination. 

History tells us that success against a 
single infectious disease such as malaria may 
be short lived if we are over-reliant on too 
few controls and lose a broad understanding 
of the disease. Rather than pursuing parallel 
agendas, the climate and health communities 
must work together now to deliver measur- 
able health improvements in Africa in the 
next ten years and beyond. 
Lewis Terman’s study found that people who were conscientious as children were likely to live longer. 


A long, diligent life 


A 90-year cohort study hints that personality plays an 
unexpected part in lifespan, finds Marten Lagergren. 


What factors predict a long, healthy 
and successful life? In 1921, 
psychologist Lewis Terman at 
Stanford University in California embarked 
on an ambitious project to find out. He 
selected around 1,500 gifted children from 
schools in the state and followed them from 
the age of 11 into adulthood, collecting a 
variety of data to see what might predict 
later success and accomplishment. Con- 
tinued after his death in 1956 by other 
Stanford researchers and still ongoing, 
Terman’s project has become the world’s 
longest-running longitudinal study. 
Psychologists Howard Friedman and 
Leslie Martin have brought the work up to 
date by painstakingly collecting information 
on those subjects who have died. Using death 
certificates, they have determined the length 
of life and cause of death, opening up a range 
of new analyses. The Longevity Project summarizes 
their findings on how life circumstances 
link to health outcomes, albeit for 
this select group. Some results are as expected, 
such as that smoking is bad for longevity. 
Others turn conventional wisdom on its head. 
For example, working hard for long hours 
in a demanding job to achieve high status 
is better for your health and life expectancy 
than taking it easy and lacking ambition. 
Marriage is a blessing for men more than 
women; and men suffer more adverse health 
effects from divorce, perhaps turning to 
drink or drugs. The authors emphasize the 
benefits of an active social network — more 
common for women — as a buffer against 
life’s harmful events. And they are critical of 
simple health advice, such as to jog or eat 
less fat, arguing that it is the whole approach 
to life that is essential, not the details. To 
give a person a list of health recommendations 
does not work, they point out, if the person 
cannot or does not follow them. 
BOOKS & ARTS 


Friedman and Martin explore how 
personality influences lifestyle choices using 
Terman’s meticulous records of the character 
traits of the children he followed, as noted 
by parents and teachers. When the children 
were interviewed as adults 20 years later, 
the same qualities were evident. The best 
predictor of a long and healthy life turned out 
to be conscientiousness: the extent to 
which a child was prudent, dependable 
and persistent in the accomplishment of his 
or her goals. 

The Longevity Project: Surprising Discoveries for Health and Long Life from the Landmark Eight-Decade Study 
HOWARD S. FRIEDMAN AND LESLIE R. MARTIN 
Hudson Street Press/Hay House: 2011. 272 pp. 
$25.95/£10.99 


Conscientious people do more to protect 
their health and are less likely to engage in 
risky activities such as smoking, drinking 
or drug-taking, the study found. They also 
find their way to happier marriages, better 
friendships and optimum work situations. 
As a result, they are less likely to die from 
all causes. 

Being physically active as a child is also a 
predictor of longevity, but only if that activ- 
ity is maintained into and beyond middle 
age. The life-years gained by jogging may 
amount to no more than the time you spend 
doing it, the authors note. So we needn’t all 
aim to run marathons; rather, we should just 
maintain an activity that we enjoy. 

All this might seem to suggest that our fate 
is largely determined from childhood, but 
Friedman and Martin take a more construc- 
tive view. We can work with our personality 
to improve our health, they say, but it takes 
time for the benefits to accrue. You do not 
become conscientious overnight. It is the 
long-term, determined work of adopting 
and sticking to healthy habits and seeking 
good social environments and relationships 
that makes the difference. Later follow-up 
of Terman’s subjects showed that consci- 
entiousness in middle age and later counts 
almost as much as in childhood. 

The authors provide self-tests for read- 
ers on a range of health-related factors such 
as catastrophizing (the tendency to always 
imagine the worst), life satisfaction, physical 
activity, marital happiness, job passion and 
accomplishment, and social-support net- 
work. They explain what these factors mean 
and provide guidance for improving one’s 
lifestyle. 

The difference in length of life between 
men and women has always intrigued 
epidemiologists and demographers. The Lon- 
gevity Project adds a new twist by suggesting 
that behaviour also influences the longer 
lifespan of women. In the study, chil- 
dren of either sex who were drawn to 
masculine careers (those shown by tests 
to be mostly preferred by men, such as 
being a mechanical engineer or pilot) 
had a shorter lifespan than those who 
preferred more feminine occupations 
(such as being an interior decorator or 
working with children). Thus, cultural 
dimensions may explain why life expec- 
tancy for the sexes differs over time and 
between countries and cultures. 

There are caveats to this milestone 
study. One issue is that it was originally 
planned for a narrower purpose: to inves- 
tigate predictions of career success and 

failure. Terman picked white pupils with 
high IQs from […] sample is not 
representative of the wider population. 
Conclusions cannot be drawn concerning minority 
groups, educational level, social class or 
geographic area. The authors do their best 
to account for these limitations in their 
analyses. 

Another problem is inevitable in any 
longitudinal study. Terman’s subjects, 
who were born around 1910, had very 
different lives from ours. Many soci- 
etal changes have occurred in the past 
century, particularly in gender roles. 
Terman’s subjects, known as Termites, 
lived at a time when most women were 
expected to stay at home. The different 
life choices available today are likely to 
result in smaller gender differences in 
health and longevity. 

The Longevity Project focuses mainly 
on the individual. The role of society in 
fostering good health and long life is 
seldom mentioned in the book, except 
when exposing the failure of current 
health propaganda. Despite ubiquitous 
recommendations to eat less and keep 
fit, obesity rates in the United States and 
in many other developed countries are 
soaring. The authors recognize that other 
studies are badly needed to examine the 
impacts of public policy on health and to 
develop more successful approaches. As 
they show in this excellent book, it will be 
a difficult task. But it is necessary. 


Marten Lagergren is an assistant 
professor at the Stockholm Gerontology 
Research Center, Stockholm, Sweden. 
e-mail: marten.lagergren@aldrecentrum.se 
Giant ground sloths went extinct some 10,000 years ago, but could provide conservation lessons for today. 


CONSERVATION 


After the auroch 


Emma Marris is gripped by an account of our love-hate 
relationship with extinct megafauna. 


It puzzles me that the many large, now
extinct mammals of the Pleistocene
Epoch have nowhere near the legions
of fans claimed by dinosaurs. Mammals
win the popularity contests among existing
animals, yet few children can rattle off the
weights and dietary habits of the gargantuan
North American ground sloth Megalonyx
jeffersonii or Australia’s massive buck-
toothed marsupial Diprotodon optatum.
Stegosaurus gets all the love.

One fanciful explanation is that we have
an abiding guilt for having killed them all
off in our spear-hurling days. And it seems
likely that human hunting played some
part in many of these extinctions. In Once
and Future Giants, biologist and journalist
Sharon Levy lays out the evidence for this
theory — and explores what this species
drain can teach us now. The patterns and
consequences of the Pleistocene die-offs can
help us to predict how landscapes will change
if we lose big mammals, and help us to spot
warning signs of impending extinctions.

As we hesitantly take collective responsibility
for these extinctions, we feel their
loss more keenly. Today’s ‘wild’ has diminished
along with the [...] rugged landscapes
begin to look tame and denuded. North
America’s wolves and grizzlies no longer
thrill; Yellowstone Park looks like a petting zoo.
“We live in a highly abnormal world,”
writes Levy, quoting US palaeoecologist
David Burney. “We think of ground sloths
and saber-toothed cats as peculiar and foreign,
but it is the world of our own ancestry,
the world our species evolved in.”


Once and Future Giants: What Ice Age Extinctions Tell Us About the Fate of Earth’s Largest Animals
SHARON LEVY
Oxford University Press: 2011. 280 pp. $24.95

So, scientists and conservationists
who can easily envision the landscapes of 
13,000 years ago, just before the late Pleisto- 
cene extinctions, find themselves yearning 
for the past. They are starting to experiment 
with restoring these landscapes by introduc- 
ing surrogates to fill long-vacant ecological 
roles — to graze, to browse, to kill, to knock 
over trees, even to terrify. 

Levy recounts various rewilding experi- 
ments. Some have been intentional, such as 
the Pleistocene Park nature reserve in north- 
eastern Siberia, where rare native Yakutian 
horses roam. Others were accidental, such 
as the wild-mustang preserves of the Amer- 
ican West. She reports on recent research 
supporting the notion that large animals 
are more than simply appealing — they can 
be major engineers of their ecosystems. Big 
predators such as the wolves of Yellowstone 
prevent herbivores from munching plant 
populations into oblivion and keep a lid on 
smaller predators. Big herbivores like the 
musk oxen of Greenland stop forests and 
weeds from overrunning the earth. They 
fertilize with their dung, and turn the earth 
with their big hooves. 

Levy notes that many of the surrogates 
that conservationists use are the domes- 
ticated descendants of wild creatures. 
Specially bred cattle are used as proxies for 
extinct aurochs, the giant wild cattle that 
once roamed Europe, but Levy says that 
the modern cattle pale in comparison. Real 
aurochs — the kind painted by our ancestors
in caves — were “longer of leg, bigger of
brain, more graceful and fearless than their
domesticated brethren”, she speculates.

The slightly mournful lesson of the 
book is this: any large animals we add to 
landscapes must be carefully managed. 
For example, condors reintroduced in the 
United States wear radio collars; wild mus- 
tangs are rounded up by the US government, 
dividing family groups and leaving excess 
animals held in pens. What differentiates 
such animals from pets? 

To be truly wild, according to Levy, 
animals must have their numbers con- 
trolled by wild predators, not by humans. 
They must also live with fear. “The threat of 
a hungry carnivore lurking at the water hole 
is the essence of the truly wild horse,” she
writes. And yet the idea of reintroducing 
predators — the key to wildness — is the 
most difficult to sell to local peoples around 
the world. Conservationists might love the 
thought of introducing African lions to the 
Great Plains in a bid to fill the gap left by 
the extinct American lion, but ranchers 
and rural residents understandably have 
qualms. 

“We cannot raise the auroch, but its 
tamed descendent may yet fill a vital ecological
niche,” concludes Levy in her examination
of the increasing use of domestic cattle
in conservation projects. Where once there 
were mammoths clashing tusks, giant short- 
faced kangaroos and woolly rhinoceroses, 
we now have Bessie the cow, grazing and 
fertilizing the soil and raising her head in 
vague interest as cars whizz past. It is one 
way of plugging the megafauna gap, but 
I long for the grandeur and strangeness of 
those lost giants.


Emma Marris is a writer based in 
Columbia, Missouri. 
e-mail: e.marris@gmail.com 


Books in brief 


On Being: A Scientist’s Exploration of the Great Questions of Existence

Peter Atkins OXFORD UNIVERSITY PRESS 152 pp. $19.95 (2011) 

Why are we here? Chemist and author Peter Atkins answers this big 
question succinctly and elegantly in this slim volume. Following in 
the footsteps of rationalists such as Richard Dawkins, he argues that 
we should find as much awe in the workings of science as we might 
in any god. Although he acknowledges the role of spiritual beliefs
in society and the comfort they can bring to some, he finds greater
solace in the scientific underpinnings of origins and endings, birth 
and death. 


Naked Genes: Reinventing the Human in the Molecular Age 
Helga Nowotny and Giuseppe Testa MIT PRESS 192 pp. £18.95 (2011) 
Advances in the life sciences have revealed many previously hidden 
aspects of biology, from the genes and proteins within cells to the 
developmental stages of the fetus. European Research Council 
president Helga Nowotny and stem-cell scientist Giuseppe Testa 
argue that these building blocks are not valueless, but are ‘naked’ 
blank canvasses that take on multiple meanings in different social 
contexts, from court rooms to parliaments. They assess how these 
varied perspectives influence attitudes to biotechnology in topics 
such as assisted reproduction and personalized medicine. 


Pox: An American History (Penguin History of American Life) 
Michael Willrich PENGUIN PRESS 400 pp. $27.95 (2011) 

Attitudes to public-health interventions have not changed much in 
the past 100 years, explains historian Michael Willrich. He describes 
how measures at the turn of the last century to stem the spread
of a smallpox epidemic in the United States — using quarantines,
pesthouses and ‘virus squads’ — were met with suspicion and
popular resistance despite their success. A well-organized
anti-vaccination movement sprang up to champion personal
choice over powerful government, resulting in the disputed political
landscape around inoculation that is familiar today.


Beyond the Finite: The Sublime in Art and Science 

Edited by Roald Hoffmann and Iain Boyd Whyte OXFORD UNIVERSITY
PRESS 208 pp. $24.95 (2011) 

How should we depict protein folding or negative mass? Scientists 
must create new imagery to describe such natural concepts every 
day, and in that sense they have a lot in common with artists who 
attempt to display the sublime. Nine scholars of science and art 
convey their perspectives in this volume. From the beauty of images 
taken by the Hubble Space Telescope to quantum romanticism, the 
contributors touch on natural aesthetics in physics, neuroscience, 
chemistry, painting and music. 


Bird Watch: A Survey of Planet Earth’s Changing Ecosystems 
Martin Walters UNIVERSITY OF CHICAGO PRESS 256 pp. $45 (2011)
Bird populations worldwide are threatened by climate change
and environmental destruction. This illustrated survey, produced
in cooperation with the global conservation partnership BirdLife 
International, documents all 1,227 endangered bird species on 
the Red List of the International Union for Conservation of Nature. 
Region by region, the book describes the birds’ habitats and the 
environmental pressures on them, as well as charting conservation 
efforts and top birding sites around the globe.

ALAN MASON CHESNEY MEDICAL ARCHIVES OF
THE JOHNS HOPKINS MEDICAL INSTITUTIONS


| COMMENT | BOOKS & ARTS 


Clinical precision 


W. F. Bynum enjoys a history of three revolutionary
moments in health care.


Harvard physiologist Lawrence
Henderson once remarked that at
some time between 1900 and 1912, 
a random patient with a random disease, 
choosing a physician at random, had for 
the first time in history a better than 50:50 
chance of profiting from the encounter. 
Henderson’s spread bet is not quoted 
in The Making of Modern Medicine, but it 
concurs with historian Michael Bliss’s take 
on how, when and where clinical medi- 
cine became modern. In this slim volume, 
based on his 2008 Joanne Goodman lecture 
series at the University of Western Ontario 
in Canada, Bliss offers three case studies 
that chart the leap in physicians’ capacity to 
deal with disease at the start of the twenti- 
eth century. Health care was revolutionized 
by advances in disease prevention, surgery 
and drug treatments that allowed manage- 
ment of chronic afflictions. Bliss’s measured 
evaluation of the strengths and weaknesses 
of modern medicine is persuasive and clear. 
His first case study is a smallpox epidemic 
in Montreal, Quebec, in 1885, which sick- 
ened more than 20,000 people and killed 
about 5,000. It could have been prevented: 
vaccination was almost a century old, and 
the dynamics of smallpox spread were well 
understood. But anti-vaccination sentiment 
was rife. Bliss describes how social factors led 
to the outbreak being met with a mix of fatal- 
ism and ignorance. Most of the victims were 
children who were poor, French-speaking 
and Catholic. Their religious leaders failed 
them: inoculation was viewed as an affront
to divine providence and an intrusion of
the state on a helpless citizenry. Two uneducated
physicians led the resistance, telling
parents that the removal of children
to the quarantine hospital was a death
sentence. As a result, infected children were
kept in the community and became sources of
further spread.

The Making of Modern Medicine: Turning Points in the Treatment of Disease
MICHAEL BLISS
University of Toronto Press: 2010. 112 pp. Can$21.95


It took ten months 
for the epidemic to 
burn itself out. Many 
religious leaders maintained that the episode 
was an act of God sent to a sinful people. But, 
Bliss explains, most medical observers and 
liberal journalists at the time recognized it 
as a catastrophe that should never have hap- 
pened, caused by antiquated values that were 
being supplanted by the new creed of science. 

Spearheading that scientific approach was 
the Johns Hopkins School of Medicine in Bal- 
timore, Maryland, the beginnings of which 
Bliss relates in his second case study. Its first 
professor of medicine was William Osler, 
who had left McGill University in Montreal 
in 1884 — a year before the epidemic — and 
moved to Johns Hopkins in 1889 after a spell 
at the University of Pennsylvania in Philadel- 
phia. Osler excelled as a teacher and medical 
role model. He is still revered as such. 


The Johns Hopkins School of Medicine, founded in 1893, pioneered the scientific method in health care.

The foundation of the Johns Hopkins
School was a turning point in American 
medicine: it demanded that its students had a 
sound educational grounding, and provided 
both the facilities and the directive for its 
staff to be at the forefront of research as well 
as teaching. Its excellence was recognized 
in a gold-standard award from educational 
reformer Abraham Flexner, whose famous 
reports on the condition of medical educa- 
tion in North America in 1910 and in Europe 
in 1912 pulled no punches. 

Osler was the institution's star, but Bliss 
concentrates on his surgical colleague, 
Harvey Cushing, as the most significant 
innovator. Cushing pioneered modern 
neurosurgery, creating the field almost single- 
handedly. He arrived at Johns Hopkins as a 
resident under the school’s founding profes- 
sor of surgery, William Halsted. Bliss explains 
that much of the therapeutic power of medi- 
cine in those days stemmed from surgery, 
a point that Osler also acknowledged. Sur- 
geons could cure appendicitis, gall or kidney 
stones and other conditions that physicians 
could merely manage. Cushing could even 
operate successfully on the brain. Teachers 
and students at Johns Hopkins knew they 
were at the forefront of a medical revolution. 

The decades before and after 1900 wit- 
nessed several medical innovations, including 
aspirin. But nothing captured the headlines 
like Bliss’s third case study: the extraction of
insulin in 1921 by Frederick Banting, Charles
Best and James Collip in J. J. R. Macleod’s lab
at the University of Toronto. Louis Pasteur’s
rabies vaccine in the 1880s and Wilhelm
Röntgen’s discovery of X-rays in 1895 were
comparable media events. But the dramatic 
pictures of emaciated children on the verge of 
death suddenly putting on weight after injec- 
tions of insulin seemed almost miraculous. 

Insulin did not cure type 1 diabetes, but 
it transformed an acute, fatal disease into a 
chronic one compatible with many years of 
life. The parallels between the treatment of 
type 1 diabetes in the 1920s and the treatment 
of HIV in our own time are striking. 

We should be grateful that many diseases 
can now be managed. Yet Bliss reflects that 
the gratitude of people receiving medical 
treatment, as enjoyed by physicians half a 
century ago, has collapsed. He attributes this 
to a misplaced faith in medicine in a secular 
society. The fatalism of the Montreal poor 
has been replaced by an ardent expectation 
that medical science can solve the problems 
of the human condition. As a result, more of
us may empathize with the poet Alexander
Pope’s sober analysis from more than two
centuries ago: “This long disease, my life.”


W. F. Bynum is emeritus professor of the
history of medicine at University College 
London, UK. 

e-mail: w.bynum@ucl.ac.uk

Benedict Cumberbatch plays Frankenstein’s creation in a London adaptation directed by Danny Boyle.


A monstrous tale 


A new staging of Frankenstein plays up the monster
but draws no morals about science, finds Philip Ball. 


Mary Shelley’s Frankenstein has
been adapted and reinterpreted
innumerable times since it was 
first published, anonymously, in 1818. Vic- 
tor Frankenstein has become the archetypal 
mad scientist, unleashing powers he cannot 
control. The ‘Franken-’ label is attached to 
every new technology that seems to inter- 
vene in life, from genetic modification of 
crops to Craig Venter’s synthetic microbe. 
The latest staging of the tale is Nick Dear’s 
adaptation for the National Theatre in Lon- 
don, directed by Danny Boyle, famous for 
films including Trainspotting (1996) and 
Slumdog Millionaire (2008). The production 
is spectacular, and intelligent choices were 
made in the structure of the play (if not always
in the dialogue). By placing the creature at the 
heart of the performance (played by Jonny 
Lee Miller, on the night I saw it), Dear has 
wisely emphasized the human message of the 
story rather than attempting any ill-advised 
jibe at the hubris of contemporary science. 
The first adaptations for the stage began as 
early as the 1820s. These transformed Frank- 
enstein’s creature into the now-familiar mute, 
shambling brute, at that time based on the 
stock theatrical character of the Wild Man. 
This personification continued in the first 
film adaptation in 1910, simply called Frank- 
enstein. As well as the iconic screen version 
by James Whale in 1931 with Boris Karloff’s 
creature, there have been countless other renditions,
sequels, parodies (Mel Brooks's 1974
Young Frankenstein, The Rocky Horror Picture
Show in 1975) and postmodern interpretations
(Brian Aldiss’s Frankenstein Unbound;
Jonathan Cape, 1973).

Frankenstein
ADAPTED BY NICK DEAR. DIRECTED BY DANNY BOYLE.
National Theatre, London.
Until 2 May 2011.

Some might lament how the original novel
has been distorted and vulgarized in that
process. British literary critic Chris Baldick
has a wiser perspective. He argues in his book
In Frankenstein's Shadow (Oxford University
Press, 1987) that in interpreting a myth, we
must consider all its versions: “That series of
adaptations, allusions, accretions, analogues,
parodies, and plain misreadings which
follows upon Mary Shelley's novel is not just
a supplementary component of the myth;
it is myth.” After all, there is no definitive
version of Shelley’s story. She made small
but significant changes in the third edition
in 1831, emphasizing the Faustian themes of
presumption and retribution on which the
early stage versions insisted.

Critics still dispute what Shelley’s message
was meant to be. Far from offering a simplistic
critique of scientific hubris, the story might
instead echo Shelley’s troubled family life.
Her mother, the feminist and political radical
Mary Wollstonecraft, died from complications
after Mary’s birth, and her father,
William Godwin, all but disowned her after
she eloped to Europe with the poet Percy
Shelley in 1814. She lost her first child, named
Clara, that year, subsequently describing a
dream in which the girl was reanimated.
There is ample reason to believe Percy
Shelley's statement that the central moral of
Frankenstein is: “Treat a person ill, and he
will become wicked.”

NATURE.COM
For a review of Philip Ball’s Unnatural:
go.nature.com/jgmbl5

© 2011 Macmillan Publishers Limited. All rights reserved

If so, Dear’s adaptation has returned to the 
essence of the tale: it focuses on the plight of 
the creature, whose lone and awkward ‘birth’ 
begins the play. We see how this mumbling 
wretch, spurned as a hideous thing by Victor, 
is reviled by society until finding refuge with 
the blind peasant De Lacey. The kindly old 
man teaches the creature how to speak and 
read using Milton's Paradise Lost, the story of 
Satan’s Promethean challenge to heaven. 

Eventually, De Lacey's son and daughter-in- 
law return from the fields and drive out the 
creature in horror, whereupon he burns them 
in their cottage. These scenes are the moral 
core of Shelley’s novel, and in placing them so 
early, Dear signals that this is very much the 
monster's show. In fact, perhaps too much. For 
while the creature is the most fully realized, 
most sympathetic and inventive incarnation 
I have seen, Victor Frankenstein is left with
little to do but recoil from him and neglect all 
his other duties: marital, filial and moral. It is
clear from the outset who is the real monster. 

Throughout the run of this production, the 
two lead actors — Benedict Cumberbatch and 
Miller — alternate the roles of Victor and his 
creature. This Doppelganger theme is not a 
new one. In the 1927 British stage adaptation 
by Peggy Webling that formed the basis of 
Whale’s movie, the creature appeared dressed 
like the scientist, who foreshadows the later 
elision of creator and creature by saying: “I call 
him by my own name — he is Frankenstein.” 

The scientific elements of the tale are skated 
over. Mary Shelley provided just enough hints 
for the informed reader to make the connec- 
tion with Italian physician Luigi Galvani’s 
then-recent work on electrophysiology. Dear 
has Frankenstein mention galvanism and 
electrochemistry, but that is as far as it goes. 
There is no serious attempt to make the play a 
comment on the “Promethean ambitions” of 
modern science, as Pope John Paul II called 
them in 2002. This is a relief, because the 
trope of a solitary experimenter exceeding 
the bounds of God and nature is no longer 
the relevant vehicle for a critique. 

Whether Dear adds anything new to 
the legend — as Whale and arguably even 
Brooks did — is debatable. But it is good to
be reminded that the novel may be read not 
so much as a Gothic tale of monstrosity and 
presumption, but as a comment on the consequences
of how we treat one another.


Philip Ball is author of Unnatural: The 
Heretical Idea of Making People. 
e-mail: p.ball@btinternet.com

Scientists should
cut waste too 


Your call for scientists to rally
for continued federal funding
(Nature 470, 305; 2011) places no 
responsibility on them to reduce 
the $1.3-trillion US budget deficit. 

As many scientists depend on 
taxpayers’ money for research,
they have an obligation to 
reduce waste and inefficiency 
and to work within their means. 
Funding agencies cannot and 
should not continue to do 
business as usual. 

For example, the National 
Institutes of Health (NIH) 
imposes a salary cap of $199,700
for scientists; most other federal 
agencies do not. The ‘indirect 
costs’ claimed by academic 
institutions range from 55% to 
60% of the total grant budget. 
This implies that the taxpayer 
will pay $199,700 for an NIH- 
funded radiologist but $398,571 
if the post were funded by 
another agency. Also, 55-60 
cents of every research dollar 
will be spent on administrative 
and facilities costs, even though 
buildings and utilities have been 
paid for many times over. 

Unlike companies, non-profit 
academic institutions deliver 
a paltry return on taxpayers’ 
investments. In 2010, after 
spending nearly $3.1 billion of 
taxpayers’ money on intramural 
research, the NIH received $91.6 
million in royalties and was issued 
with 134 patents. By contrast, 
in 2009 IBM spent $6.5 billion 
on research and development, 
generated $15.1 billion in revenue 
and was issued with 4,914 patents. 
Matthew Kumar Mayo Clinic, 
Rochester, Minnesota, USA. 
mkumar@mayo.edu 


Anthropology: it can 
be interdisciplinary 


Adam Kuper and Jonathan
Marks’s gloomy portrait of
integrative, big-question
research in anthropology (Nature
470, 166-168; 2011) does not
square with the large body
of literature that covers areas
such as behavioural ecology, 
cultural evolution, cognitive 
anthropology, gender studies, 
cross-cultural economics, moral 
psychology and environmental 
change. Publishing this work
in high-impact general science
and focused interdisciplinary 
journals ensures wide attention 
beyond the discipline. 

The Evolutionary 
Anthropology Society was 
created to cut across traditional 
anthropological divides. It has 
some 350 members drawn 
from biological, cultural and 
archaeological specialities. 
Other interdisciplinary scholarly 
associations are The Human 
Behavior and Evolution Society, 
the European Human Behaviour 
and Evolution Association, and 
the Society for Anthropological 
Sciences. Each has hundreds 
of members active in the kind 
of research the authors claim is 
scarce or lacking. Productive 
interdisciplinary centres, such 
as the Centre for the Evolution 
of Cultural Diversity based at 
University College London, also 
catalyse innovative research that 
integrates biological, cultural and 
archaeological perspectives. 

We feel that a genuinely 
interdisciplinary field of human 
diversity is emerging, synthesizing 
ideas and data from the social 
and behavioural sciences with 
theory and modelling techniques 
from evolutionary biology and 
game theory. Unlike Kuper and 
Marks, we see ample evidence 
that this work features in 
current debates about cognition, 
altruism, economic behaviour 
and environmental degradation 
(see, for example, M. Borgerhoff 
Mulder et al. Science 326, 
682-688; 2009). 

Eric Alden Smith on behalf of 
30 co-signatories*, University of 
Washington, Seattle, USA. 
easmith@u.washington.edu
Michael Gurven University of
California, Santa Barbara, USA. 
Monique Borgerhoff Mulder 
University of California, Davis, 
USA. *A full list of signatories is
available online at http://dx.doi.org/10.1038/471448b.


Anthropology: follow 
field primatologists 


Field primatology is one area 
of anthropology in which a 
classical cross-disciplinary 
approach is thriving (Nature 
470, 166-168; 2011). 

Field primatologists search 
the archaeological record of 
tool-using primates to gain 
insight into their cultures and 
traditions. Similarly, researchers 
of primate communication have 
set up a linguistic framework 
to investigate its intricacies in 
the context of the evolution of 
human language and music. 

Like Jane Goodall and Birute 
Galdikas, whose studies on 
the great apes could read as 
ethnographies of a human 
group, field primatologists 
embrace long-term participant 
observation, a hallmark of social 
anthropology. 

With the decline of natural 
forests, primate populations 
are nearly all intimately linked 
with their human neighbours. 
Field primatologists study their 
interactions, balancing the need 
for primate conservation with the 
cultural practices of the humans 
on whom the animals depend. 

They advise on issues such as 
bushmeat hunting, the pet trade 
and the evolution of diseases 
that affect both human and 
non-human primates. They join 
cultural anthropologists and 
local people in examining data 
on past distributions and recent 
local extinctions of non-human 
primates and other animals. 

In short, field primatology 
is successfully retaining 
and expanding the spirit of 
anthropology.
K. Anne-Isola Nekaris,
Vincent Nijman Oxford Brookes 
University, Oxford, UK. 
vnijman@brookes.ac.uk 

Laurie R. Godfrey University of 
Massachusetts, Amherst, USA. 


Intolerance: UK chief 
scientist responds 


Andy Stirling and Brian Wynne 
(Nature 471, 305; 2011) call 
respectively for a democratic 
approach to scepticism and
for recognition that scientific
evidence often forms only part of 
complex decisions. I agree with 
them on both counts. 

Of course it is true that 
advancement is attained through 
criticism, scepticism and debate. 
But my point was that there can 
sometimes be a thin line between 
healthy scepticism and a cynical 
approach that ignores or distorts 
inconvenient evidence. 

Where significant consensus 
exists on an issue, this has not 
always been made obvious; 
also, tokenistic opposing views 
can be presented in a way that 
exaggerates their support. 

Clearly, the role of scientific 
evidence in decision-making 
must be considered in the wider 
political and social context. 
However, I make no apology for 
demanding that the fundamental 
evidence and weight of 
consensus in such cases is set out 
in a proper and fair way. 

John Beddington Chief Scientific 
Adviser to HM Government, 
Government Office for Science, 
London, UK. 
mpst.beddington@bis.gsi.gov.uk 


Negative results 
are published 


Jonathan Schooler argues in 
favour of an open-access database 
of negative results (Nature 470, 
437; 2011). But publishing such 
results in scientific journals is
advantageous for authors, who
can then list them among their 
papers. 

Several journals specifically 
publish negative results. I’m
aware of the Journal of Negative 
Results in Biomedicine, the 
Journal of Negative Results — 
Ecology and Evolutionary Biology 
and the psychology Journal of 
Articles in Support of the Null 
Hypothesis. There is a forum 
in the Journal of Universal 
Computer Sciences for negative 
results, and PLoS ONE also 
publishes them. Several other 
such journals have come and 
gone; all, I think, are open access. 

Even so, negative findings are 
still a low priority for publication, 
so we need to find ways to make 
publishing them more attractive. 
Bob O’Hara Biodiversity 
and Climate Research Centre, 
Frankfurt, Germany. 
bohara@senckenberg.de 


Animal research: a 
personal lesson 


Had I been a participant in 
your survey on animal-rights 
activism (Nature 470, 452-453; 
2011), I would have replied that 
animal extremism once had a
negative effect on me — but in an 
unexpected way. 

I worked for many years as 
a primate researcher studying 
animal models of abnormal 
development. Two years after 
the publication of Peter Singer’s 
Animal Liberation (New York 
Review/Random House; 1975), 
my lab was attacked and its 
rhesus monkeys released. The 
monkeys were all recaptured and 
none was seriously injured. I felt 
intimidated, insulted and furious 
at what I saw as anti-science 
stupidity. 

My anger was such that I 
did not give a thought to the 
possibility that the perpetrators 
might have been infected with 
deadly herpes B virus from the 
monkeys. I failed to alert the 
emergency departments in the 
area about this lethal possibility. 

For years, my fury blocked 
the self-reflection that is 
expected of any scientist who 
harms vulnerable animals for 
presumed human benefit.
I dismissed even reasonable
ethical questions directed at
me and my work. Eventually,
however, I took up a fellowship 
at the Kennedy Institute of 
Ethics at Georgetown University 
in Washington DC, and at the 
National Institutes of Health 
Clinical Center, where I studied 
bioethics on the moral standing 
of animals. My intellect
and sense of compassionate
responsibility broadened; 
research ethics became my life's 
focus. 

Healthy debate about animal 
research and the ethical and 
scientific issues involved must 
be encouraged, even in the 
face of hostility. We must also 
remember that it is unreasonable 
and inaccurate to label 
everyone who opposes animal 
experiments as ‘extremists’.

John P. Gluck University of 
New Mexico, Albuquerque, New 
Mexico, USA. 

e-mail: jgluck@unm.edu 


Animal research: the 
peaceful approach 


In your articles on animal
activism (www.nature.com/animalresearch),
there was no mention of the many individuals
and organizations who work
peacefully and legally to educate
the public and policy-makers 
about the ethical and scientific 
issues surrounding the use of 
animals in research. 

At the American Anti- 
Vivisection Society, we seek 
to bring about meaningful, 
long-term change for animals 
in laboratories through the 
development and use of high- 
quality, non-animal-based 
teaching, testing and research. 

Founded in 1883, the society 
brings a long-term perspective 
on opposing views and tactics. 
Biomedical research lobby 
groups in the United States have 
for decades opposed modest 
improvements to animal welfare 
laws and convinced researchers 
that there is too much red tape 
surrounding animal work. Yet 
the use of the most common 
lab animals — rats and mice 
— remains unregulated in 
the United States, and there 
is almost no accountability to 
the public, even regarding how 
many of these animals are used. 

The same lobby groups 
attempt to sully the terms 
‘animal rights’ and ‘activists’ 
by amplifying the illegal and 
offensive actions of individuals 
who do not represent any of us 
(see, for example, go.nature.com/bxabrm).
The reality is that ‘peaceful activists’
often drive public policy on
social issues. This has been
true for animal issues for
several decades and includes 
improvements to the US federal 
Animal Welfare Act. 

Crystal Miller-Spiegel American 
Anti-Vivisection Society, 
Jenkintown, Pennsylvania, USA. 
cmillerspiegel@aavs.org 


Animal research: 
replacing the lab rat 


Your coverage of animal 
research (www.nature.com/ 
animalresearch) focuses on well- 
worn themes from proponents, 
but does offer a way forward. 

British biologist Peter 
Medawar predicted years 
ago that the use of animals in 
research would some day be 
completely replaced by more 
innovative methods (The Hope 
of Progress, Methuen; 1972). 
And Colin Blakemore, an ardent 
defender of animal research, 
has repeatedly stated that: 
“Everyone hopes that a time 
will come when no animal is 
used at all.” To translate these
congruous perspectives into 
action, we need to develop the 
kind of proactive strategies that 
you call for. 

The results of your poll 
(Nature 470, 452-453; 2011) 
indicate that some scientists 
might be ready to take this idea 
forward. Others are clearly not 
immune to the ethical tensions 
in animal research. Sadly, most 
feel that the polarized debate 
on animal research makes 
it difficult to express more 
nuanced views, presumably 
because they do not want to be 
perceived as giving ammunition 
to the extremists. 

Medawar’s vision to replace 
animal experimentation 
is a goal that is worthy of 
serious effort, for the sake of 
scientific innovation, ethical 
responsiveness and animal 
protection. We should not be 
deterred by either the scientific 
challenges or the actions of a
handful of extremists. 

Martin Stephens The Humane 
Society of the United States, 
Washington DC, USA. 
mstephens@hsus.org 
Overcoming catalytic bias


Metathesis reactions can be used to make carbon-carbon double bonds — bar one isomeric class. By using new catalysts 
and balancing out the stabilities of intermediates in the reaction, the elusive isomers can be made. SEE ARTICLE P.461 


DAESUNG LEE 


The alkene metathesis reaction, in which
carbon-carbon double bonds (C=C bonds)
are redistributed between alkene
molecules, has had an enormous impact
on chemical research and industry, as was 
recognized in 2005 with a Nobel prize. But
this remarkable reaction has an inherent 
limitation — it cannot generate the thermo- 
dynamically less stable isomers of the alkene 
products. On page 461 of this issue, Hoveyda 
and colleagues report that they have overcome
this problem using specially designed
molybdenum catalysts. Their discovery opens 
up even more opportunities for the powerful 
metathesis reaction. 

In alkene metathesis, two alkenes couple 
together to form a new alkene product along 
with an alkene by-product (typically ethylene, 
CH₂=CH₂; Fig. 1a). Conventional methods for
making C=C bonds usually involve unstable 
reactants; by contrast, metathesis uses rela- 
tively stable alkenes as substrates, together 
with a metal alkylidene catalyst — a compound 
that contains a metal-carbon double bond. But 
metathesis tends to generate the more thermo- 
dynamically stable isomers (E isomers) of the 
products, rather than the less stable Z isomers. 
What's more, metathesis reactions are reversi- 
ble, and Z alkenes are more prone to the reverse 
reaction than are E alkenes, reinforcing the 
tendency of metathesis to generate E alkenes. 
The preparation of Z alkenes by metathesis has 
therefore been a great challenge, except in a 
few cases.

Hoveyda and co-workers have met this challenge
by using molybdenum alkylidene complexes
as metathesis catalysts (Fig. 1b). These
complexes overcome the inherent thermo- 
dynamic preference of the metathesis process, 
substituting it with a non-reversible reaction 
regime that selectively generates Z alkenes. 
The authors used a combination of electron- 
donating and electron-accepting ligands on 
the molybdenum atom of the complexes to 
improve catalytic performance, an effect that 
has been predicted by computational models.
Furthermore, they used a ‘monodentate’ ligand 
that makes the complexes structurally flexible, 
thereby allowing the catalysts to adapt to the 
structural strains imposed on them during 
the critical catalytic steps. These combined
effects make the complexes especially effective
catalysts for metathesis. The Z selectivity of
the catalysts seems to be a consequence of the
bulky, freely rotating monodentate ligand:
in the reaction intermediate that leads to the
formation of an alkene product, this ligand
orients other bulky chemical groups so that
they adopt the Z rather than the E arrangement
in the product (Fig. 1c).

Figure 1 | Selectivity in cross-metathesis. a, Cross-metathesis (CM) reactions are intended to produce
heterocoupled alkene products, but up to six possible products, including homocoupled ones, can
actually form. E alkenes are generally produced, rather than the corresponding thermodynamically less
stable Z alkenes. Ethylene is also produced as a by-product. A and B represent different chemical groups;
M is the metal centre of the catalyst. b, Hoveyda and colleagues report molybdenum (Mo) catalysts, one
of which is shown here, that enable highly Z-selective CM reactions. Ph is a phenyl group; TBS is a tertiary
butyldimethylsilyl group, (C₄H₉)Si(CH₃)₂. c, Two possible configurations are shown for an intermediate
that forms during CM reactions using Hoveyda and co-workers’ catalyst. Destabilizing interactions
between A and B in the left-hand structure (which goes on to yield a Z-alkene product) are smaller
than those between B and the aryloxide ligand in the right-hand structure (which yields the E alkene).
The left-hand intermediate therefore forms preferentially, so that the CM reactions are Z selective.

Although the new catalysts provided excel- 
lent reactivity and Z selectivity in metathesis 
reactions, the authors still needed to refine 
things further, especially for reactions involving 
two different alkenes. Such cross-metathesis 
(CM) reactions can generate up to six differ- 
ent products, because each reactant molecule 
may react either with another molecule of
itself (homocoupling) or with the other reac- 
tant (heterocoupling, the desired reaction), 
potentially yielding E or Z isomers in each 
case (Fig. 1a). For efficient heterocoupling, the
reactivity of the reactants must be tuned appropriately.
In a previous CM reaction reported by
the same research group, the authors tuned the 
reactivity of one of the reactants by using ring 
strain — an effect that destabilizes certain cyclic 
molecules. But no such obvious orchestration 
is possible in reactions of linear alkenes, such 
as those used in the present work.
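
The figure of six can be checked by simple enumeration: two reactants can pair in three ways (two homocouplings and one heterocoupling), and each pairing can in principle give an E or a Z isomer. The short Python sketch below only illustrates that counting; the labels A and B follow Fig. 1a, and the function name `cm_products` is a hypothetical helper, not anything taken from the paper.

```python
from itertools import combinations_with_replacement, product


def cm_products(reactants=("A", "B"), isomers=("E", "Z")):
    """Enumerate coupled-alkene products as (partner_1, partner_2, isomer).

    Illustrative counting only: it lists which combinations are
    conceivable, not which ones a given catalyst actually produces.
    """
    # Unordered pairings of the reactants: A+A, A+B, B+B
    pairings = combinations_with_replacement(reactants, 2)
    # Each pairing can in principle give either double-bond geometry
    return [(p, q, iso) for (p, q), iso in product(pairings, isomers)]


products = cm_products()
# Heterocoupled products are those whose two partners differ
hetero = [entry for entry in products if entry[0] != entry[1]]
```

Of the six entries in `products`, only `("A", "B", "Z")` corresponds to the heterocoupled Z alkene that the new catalysts are designed to favour.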

Hoveyda and colleagues realized that, to 
obtain heterocoupled Z alkenes from their 
CM reactions, and to prevent side reactions 
that regenerate starting materials, they would
have to obtain a delicate balance in the reactivities
of the alkylidene intermediates formed
during the reactions. To achieve this, they 
chose alkene reactants that included chemical 
groups that would bias the reactivities of the 
alkylidene intermediates through electronic 
and steric effects (repulsion that occurs when 
bulky groups are brought too closely together). 

The first reactants they used were enol 
ethers — compounds in which a C=C bond is 
connected to the rest of the molecule by an 
oxygen atom (see Fig. 2 of the paper). The C=C
bonds in enol ethers have a higher electron 
density than those in simple alkenes, which 
makes them more reactive for metathesis. 
The authors found that, for the best conversion 
of reactants to products and for maximum 
yields of Z alkenes, the number of moles of 
enol ether used in the reactions should be 
ten times that of the other alkene reactant. 
By this means, they obtained high yields of 
heterocoupled products in reactions of enol 
ethers with simple alkenes, with exceptionally 
good Z selectivity. 

Why are the reactions so successful? Dur- 
ing metathesis reactions, a key intermediate 
known as a methylidene is produced. This 
methylidene can initiate reverse-CM reac- 
tions with Z-alkene products and generate 
unwanted E isomers. But in Hoveyda and 
colleagues’ reactions, the methylidene reacts 
mainly with the enol ether reactants because 
there is an excess of them and because they 
are more reactive. What’s more, the electronic 
properties of the resulting alkylidene inter- 
mediate are such that it preferentially reacts 
with an alkene rather than with another 
electron-rich enol ether. Heterocoupling 
is therefore the main reaction, rather than 
homocoupling between enol ethers. 

The enol ether reactants used by the authors 
were cheap and readily available, but some enol 
ethers are much more expensive, in which 
case using a large excess of them would be 
undesirable. Fortunately, Hoveyda et al. found 
that unwanted methylidene formation can be 
minimized if the volatile ethylene by-product 
is removed by evaporation as it is produced. In 
this way, the authors maintained high Z selec- 
tivity in their reactions using just twice as 
much enol ether as alkene. They demonstrated 
the effectiveness of this protocol by using it to 
synthesize an unusual lipid containing a Z enol 
ether substructure. 

The second reactants examined by the 
authors were allylic amides — compounds 
in which an amide group is attached at the 
carbon atom adjacent to a C=C bond (see 
Fig. 3 of the paper). Unlike enol ethers, the
C=C bonds in allylic amides are electroni- 
cally similar to those in simple alkenes, mak- 
ing homocoupling between the amides a 
potential problem. But Hoveyda et al. used a 
compound that incorporates a bulky amide 
group, which prevents such homocoupling,
although homocoupling between the other
alkene molecules in the reaction could still be
a problem. Nevertheless, the authors obtained 
excellent yields and Z selectivities in their CM 
reactions of allylic amides with alkenes. 

The factor that determines the final prod- 
uct in these reactions is the relatively slow 
reverse CM of the heterocoupled product that 
is formed initially, compared with the faster 
reverse reaction of homocoupled alkenes. 
Removal of the ethylene by-product from the 
reactions was critical for efficient Z-selective 
CM, because this minimized the formation of 
methylidene species that convert Z isomers 
of products to their E isomers. The authors 
went on to use CM of an allylic amide with 
an alkene as a key step in the synthesis of the 
antitumour agent KRN7000. The use of this 
reaction allowed the agent to be made in just 
nine steps — the shortest synthetic sequence 
for the compound reported so far, underlining 
the effectiveness of the method. 

Several aspects of these new metathesis reactions
remain to be further refined: the range
of alkene substrates that can be used should 
be broadened beyond enol ethers and allylic 
amides, for example, and ways should be 
found to avoid using an excess of one of the
reagents. The conversion of starting materi- 
als to products in the reactions is currently 
not complete, so achieving complete conver- 
sion and higher yields of products without 
sacrificing the Z selectivity is also desirable. 
Nevertheless, these Z-selective CM reactions 
are highly promising and will potentially be 
of use for the preparation of numerous com- 
pounds, with far-reaching consequences for 
the future of metathesis chemistry.
In vitro sperm maturation


Anticancer therapies can impair male fertility. Whereas men can opt to freeze 
their sperm before treatment, young boys don’t produce mature sperm and so 
lack this choice. Work in mice offers hope for such patients. SEE LETTER P.504 


MARCO SEANDEL & SHAHIN RAFII 


The blueprint for producing mature,
functional spermatozoa in a laboratory
dish — all the way from stem cells
to flagellated sperm — has eluded reproduc- 
tive biologists for decades. With a view to the 
eventual in vitro production of human sperm 
for clinical use, the main criterion of nor- 
mal gamete function is the ability to support 
fertilization, with the subsequent development 
of normal offspring. Reporting on page 504 of 
this issue, Sato et al. meet this challenge in
mice. 

The process of spermatogenesis in mam- 
mals persists throughout almost all of adult- 
hood. It starts with spermatogonial stem cells, 
which differentiate from type A spermato- 
gonia into type B spermatogonia, and then into 
spermatocytes. The spermatocytes undergo 
meiotic cell division to form spermatids, and 
finally spermatozoa. Complete maturation
takes more than a month in most mammals. 

Previous efforts to nurture sperm in vitro
have either managed to recapitulate only parts
of this complex differentiation process or failed 
to demonstrate the production of normal, 
fertile offspring. Although the organ-culture 
methods developed in the 1960s allowed 
germ cells to progress to meiosis, until now
no methods could support the entire process. 

Sato et al. reasoned that the orchestration of
cell-maturation signals during spermatogen- 
esis should be achievable in a dish by providing 
nearly all of the cellular components found in 
the testes. To this end, they cultured testicular 
fragments to maintain the proper microenvi- 
ronment for cell differentiation. This involved 
suspending fragments of immature testis on 
a semi-solid support, such as agarose, and 
partially bathing them in liquid. This organ- 
fragment culture system — referred to as the 
gas-liquid interface method — balances the 
delivery of nutrients from the culture medium 
to the maturing cells with the need for efficient 
gas exchange to maintain spermatogenesis for 
more than two months. 

50 Years Ago

“The great majority of school
children are not only robust and
healthy, but are taller and heavier
than their predecessors”, states Sir
John Charles, the chief medical
officer of the Ministry of Education,
in his report for 1958 and 1959 ...
Only about five per cent of school
children contracted a notifiable
infectious disease in the years under
review. Tuberculosis continued
to decline. Poliomyelitis reached
its lowest level for thirteen years,
and vaccination against it was
undertaken vigorously everywhere
... The main cause of death among
children is through accident ... In
1958, 869 children of 5-15 years
died from accidents, including 395
from accidents involving motor
vehicles, 174 from drowning, and
58 from burns and scalds.

From Nature 25 March 1961

100 Years Ago

I have just been told a very interesting
story by Mr. James Day of this town.
Many years ago he and his father ...
noticed a fox come searching along
the hedgerows ... they saw that
he was collecting the sheep’s wool
caught on the thorns and brambles.
When he had gathered a large
bunch he went down to a pool ...
and backed slowly brush first into
the water, until he was all submerged
except his nose and the bunch of
wool, which he held in his mouth.
He remained thus for a short time,
and then let go of the wool, which
floated away; then he came out,
shook himself, and ran off. Much
astonished at this strange proceeding,
they took a shepherd's crook ... and
pulled the wool out. They found
that it was full of fleas, which, to
save themselves from drowning,
had crept up and up the fox’s brush
and body and head and into the
wool, and that thus the wily fox had
got rid of them. T. McKenny Hughes

From Nature 23 March 1911

To visualize and score maturing germ cells
under different conditions, the authors used
a marker for spermatogenesis: their mice
were genetically engineered to express green 
fluorescent protein (GFP) under the control of 
regulatory elements for genes that are activated 
only when germ cells progress into meiosis and 
beyond. As expected, the tissues that they col- 
lected from the testes of newborn mice showed 
little or no baseline expression of GFP. 

The authors had previously optimized
various parameters involved in in vitro sperm 
formation, including temperature and the 
choice of basic ingredients for the medium. 
Notably, fetal bovine serum (FBS) seemed 
to be an important component, because its 
absence resulted in negligible maturation — as 
evidenced by the lack of the GFP signal. 

In their present work, Sato et al. confirm
the importance of FBS but, borrowing from
the field of embryonic-stem-cell biology,
they find that an alternative to FBS known 
as knockout serum replacement (KSR) is 
even more effective. This finding is counter- 
intuitive, because KSR is commonly used to 
maintain stem cells in an undifferentiated state. 
A clue to the mechanism involved comes from 
the fact that the lipid-rich albumin component 
of KSR is itself highly effective in boosting 
differentiation. 

The authors used in vitro fertilization (IVF) 
techniques to demonstrate the authenticity of 
the sperm collected from their cultures: they 
obtained male and female offspring that were 
themselves fertile. 

The preservation of fertility is a major con- 
cern for patients requiring therapy, such as 
chemotherapy or radiation therapy, that can 
inadvertently destroy germ cells. In men, 
this problem can be mitigated by banking 
sperm before treatment. The solution is less 
straightforward in pre-pubescent boys. On 
the basis of pioneering work in animals by 
Brinster and others, the idea of transplanting
cryopreserved spermatogonial stem cells is a 
reasonable strategy, although it has not yet 
been rigorously assessed in humans. Further- 
more, the technology for the long-term culture 
and expansion of human spermatogonial stem 
cells has not been standardized, nor has the 
safety of the approach been tested. 

Sato and colleagues’ results suggest a viable 
alternative. In this scheme, boys would undergo 
testicular biopsy before chemotherapy or 
radiation therapy, to obtain tissue for
cryopreservation (Fig. 1). If infertility occurs, the
testicular fragments could be thawed and sperm 
obtained from organ culture for IVF. Such a
protocol would bypass the need for surgical 
spermatogonial stem-cell transplantation. 

It remains unclear whether the success 
of this system is due to signalling molecules 
released by the germ cells themselves, or to 
molecules released by the surrounding somatic 
(non-germ) cells, or to both. Nonetheless, the 
integrity of somatic cells, especially Sertoli 
cells, seems to be essential. However, this is not 
surprising, given that germ-cell maturation
is known to depend on somatic-cell signals.
In fact, even when embryonic stem cells have
been used to produce germ cells in vitro, signals
contributed by differentiating testicular
somatic cells in the culture seem to be required.

Figure 1 | Possible scenario for preserving
fertility, based on the new work. After a boy
has been diagnosed with cancer, a biopsy would
be performed to obtain testicular fragments for
long-term cryopreservation. Later in life, when the
individual wants to start a family, fertility would
be assessed, and if he cannot produce functional
sperm, the stored tissues could be thawed for organ
culture. Sperm formed by this ex vivo method
could then be used for in vitro fertilization (IVF).

The exact nature of the external signals 
that enhance sperm maturation is not the 
only remaining mystery. It is also not known 
whether the resulting offspring — especially 
those produced from cryopreserved tissue — 
are generally healthy. Indeed, fertility of the 
offspring is just a crude indicator of whether 
gametes are ‘normal’. Investigations should
be made into whether the progeny Sato and 
co-workers generated by IVF are healthy in 
other ways (with respect to ageing, immune 
function, behaviour and so on). 

As for the consequences of in vitro sperma- 
togenesis at the molecular and cellular levels, 
previous data have indicated that adverse
epigenetic effects occur when cells, especially 
gametes, are maintained in culture. Whether 
DNA repair, which is essential for spermato- 
genesis in vivo, functions normally in vitro is 
another concern. Subtle genetic or epigenetic 
changes could be pivotal for the well-being of 
subsequent generations. These caveats aside,
the organ-culture approach represents a crucial
experimental advance along the thorny
path to the clinical use of sperm developed
in vitro.
Zooming in on a gene


Genome-wide association studies are often criticized for providing little insight of 
immediate physiological relevance. The finding of one such study, which implicates 
a signalling molecule in schizophrenia, is welcome news. SEE LETTER P.499 


HUGH D. PIGGINS 


Schizophrenia affects between 3 and 7 in
every 1,000 individuals. Yet, as for most
mental disorders, the origins and mechanisms
of this multi-symptom psychiatric
illness remain elusive. In this respect, Vacic and
colleagues’ report on page 499 of this issue,
the first genome-wide association study to
implicate a single gene in this complex disease,
is enlightening.

Throughout the late twentieth century,
schizophrenia research focused mainly on
alterations in the levels of the neurotransmitter
dopamine and in particular of its receptors.
However, this proved to be only one piece of
the puzzle.

Hopes were again raised with the emergence
of tools that can detect and map genetic variations
and alterations. One such modification,
known as copy number variation (CNV),
involves structural changes in the genome that
result from disruptions, duplications or deletions
of sections of DNA. Although researchers
tend to look for common CNVs that are
associated with complex disorders, large-scale
and sensitive genome-wide association studies,
in which thousands of genes in patient samples
can be assessed, also allow the detection
and mapping of rare CNVs and the genomic
regions (loci) enriched in them.

One idea that is rapidly gaining momentum
is that the accumulation of rare CNVs is associated
with a high risk of developing schizophrenia
and other mental disorders such as
autism. Indeed, CNVs on several chromosomal
loci (microdeletions at loci 1q21.1,
3q29, 15q13.3 and 22q11.2, as well as microduplication
at the 16p11.2 locus) are associated
with the risk of schizophrenia. Together,
however, these CNVs account for only 2-4%
of schizophrenia cases and do not extensively
contribute to our understanding of the mecha- 
nism of this disorder. 

Vacic et al. carried out a genome-wide
association study of 8,290 patients with 
schizophrenia. They find that 0.35% of these 
patients carry rare CNVs in the chromosomal 
locus 7q36.3. These rare copy number gains, 
or microduplications, were much less frequent 
(0.03%) among 7,431 healthy controls. It is 
particularly intriguing that all of the micro- 
duplications overlapped with — or occurred 
within 89 kilobases upstream of — a single 
gene, VIPR2. 
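
To get a feel for how rare these microduplications are, the percentages can be turned back into approximate head counts. The Python sketch below is illustrative arithmetic only: the counts are back-calculated from the rounded percentages quoted here (they are not figures reported in this article), the ratio is a crude comparison of frequencies rather than the statistic used in the study, and `implied_carriers` is a hypothetical helper name.

```python
def implied_carriers(cohort_size, percent):
    """Approximate number of carriers implied by a rounded percentage."""
    return round(cohort_size * percent / 100)


# Cohort sizes and carrier frequencies as quoted in the text
cases = implied_carriers(8290, 0.35)      # patients with schizophrenia
controls = implied_carriers(7431, 0.03)   # healthy controls

# Crude ratio of the two quoted frequencies (not an odds ratio)
frequency_ratio = 0.35 / 0.03
```

On these rounded figures, roughly 29 patients and about 2 controls would carry a microduplication, a frequency about 12 times higher among patients.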

The VIPR2 gene encodes the VPAC2 recep- 
tor, which binds a brain chemical known 
as vasoactive intestinal peptide (VIP). The 
VIP-VPAC2 signalling pathway has been 
implicated in several neural and behavioural 
processes, and is known to influence circadian
rhythms and cognition. Vacic and co-workers’
identification of several duplications in the 
protein-coding sequences at or near VIPR2 
suggests that alteration in VIP signalling is 
also a risk factor for schizophrenia. 

The activity of the VPAC2 receptor is positively
coupled with that of the intracellular
enzyme adenylyl cyclase. And stimulation of this
receptor increases the intracellular levels of the 
signalling molecule cyclic AMP, which is gener- 
ated by adenylyl cyclase. To determine whether 
the VIPR2-associated CNVs have a physiological
role, the authors examined immune cells
called lymphocytes obtained from subjects 
who had 7q36.3 microduplications and from 
control subjects. Specifically, they measured 
cAMP levels in response to the addition of VIP 
or another selective VPAC2 activator. Lym- 
phocytes from patients carrying the micro- 
duplications showed significantly higher levels 
of cAMP than did cells from controls. 
The implication of cAMP-mediated signalling
in schizophrenia is particularly noteworthy
because this molecule has several crucial functions,
including effects on neuronal activity and
gene transcription. Changes in VPAC2 activity
are one route by which this signalling pathway 
might be affected in schizophrenia. Unfortu- 
nately, how 7q36.3 microduplications affect the 
structure of VPAC2 and its coupling to adenylyl 
cyclase remains unknown. 

VPAC2 is expressed throughout the nervous 
system, and so it is essential to identify where 
in the brain of patients with schizophrenia the 
microduplications in VIPR2 are active and
how they alter neuronal behaviour. Candidate 
sites include the amygdala, hippocampus and 
suprachiasmatic nuclei, which are respectively 
involved in emotion, learning and memory, 
and circadian rhythms. Interestingly, mice with 
targeted disruptions in VIP-VPAC2 signalling
show alterations in neural processes affected in 
schizophrenia, including cognition, responses
to sensory stimuli and sleep.

Vacic and colleagues’ study coincides with
another investigation that also points to
CNVs at 7q36.3 as potential risk factors for 
schizophrenia. Thus, converging evidence sup- 
ports voluntary testing for CNVs at this locus 
to detect the likelihood of a person develop- 
ing schizophrenia. Of course, such an endeav- 
our is fraught with concerns over whether a 
person really wants to know that they are at 
risk of developing schizophrenia, or indeed 
over how this CNV influences particular 
symptoms associated with the disorder. 

These studies also identify VPAC2 as a
potential target for therapy. The distribution 
of this receptor in mammalian tissues is rea- 
sonably well mapped, and VPAC2-selective 
compounds have been, and continue to be, 
developed. Potentially, therefore, drugs could 
be tailored to regulate VIP-VPAC2 signalling 
in distinct regions of the brain. A possible, 
though surmountable, problem is that neuro- 
peptides other than VIP bind to VPAC2, and 
that VIP can signal through other receptors. 
Another cause for caution is that the relative 
incidence of CNVs at the 7q36.3 locus — 
although significant — is low and may account 
for only a small proportion of the risk. It is more 
likely, therefore, that VIP inhibitors could form 
part of a cocktail of drugs to treat the symptoms 
associated with schizophrenia.
The soot factor


The surface warming due to emissions of black-carbon aerosols over the second 
half of the twentieth century has been identified in observations. These findings 
will inform debate over the climatic effects of controlling such emissions. 


JOHANNES QUAAS 


Anthropogenic emissions affect global
climate. In a paper published in
Atmospheric Chemistry and Physics,
Jones et al. add to understanding of this
‘forcing’ of climate in a study of one of the
factors involved — black carbon, or soot. 

Greenhouse gases, most prominently 
carbon dioxide, warm the planet by trapping 
infrared radiation. A less-recognized, but 
important, climate forcing is that caused by 
aerosols, small particles emitted during the 
combustion of fossil fuel and biofuel such as 
wood. The additional anthropogenic aerosols
reflect sunlight, and also act as
cloud-condensation nuclei — particles on which
water vapour can condense and form clouds, 
which reflect more sunlight if they are formed 
from greater numbers of droplets. Both 
processes cool the Earth’s surface. 

Black carbon, however, is a special aerosol 
because it also strongly absorbs sunlight (Fig. 1) 
— precisely the reason it looks black. The 
immediate effect of this absorption of sunlight 
is to warm the atmosphere and cool the surface.
But because the Earth system as a whole
absorbs sunlight rather than reflecting it, the
overall result is a warming. Clouds may serve 
as a medium that strongly enhances this effect: 
if the atmosphere is heated as a result of the 
absorption of radiation, clouds may dissolve, 
which would probably constitute a further net 
warming. Aerosols quickly settle through grav- 
ity and are washed out by precipitation. Their 
typical lifetime in the lower atmosphere is thus 
relatively short (up to one week), so they are 
concentrated near their sources. The pattern 
of the forcing by anthropogenic aerosols thus 
reflects their main source regions. 

Jones et al. now identify the pattern of
surface warming produced by anthropogenic 
black carbon in the observational record for 
the second half of the twentieth century. The 
analysis involved performing a set of climate- 
change simulations for the twentieth century, 
using a coupled atmosphere-ocean general 
circulation model and considering various 
combinations of individual forcings (green- 
house gases, aerosols including black carbon, 
ozone and land-use changes, as well as natu- 
ral forcings such as volcanic emissions and 
changes in solar irradiation). 

Figure 1 | Plume of soot aerosol overlying clouds above the eastern Atlantic Ocean. This image, taken
by NASA’s MODIS satellite instrument, shows a black-carbon plume (lower right) that emanated from
biomass burning in west-central Africa. The aerosol absorbs radiation and so reduces the radiation being
reflected back into space from the underlying clouds. Jones et al. show that anthropogenic emissions
of black-carbon aerosols have warmed Earth’s climate to a statistically significant level over the period
1950-99. (Faint vertical lines are lines of longitude; the image is about 2,200 kilometres across.)

The authors split the temperature responses
as estimated using these simulations into four
groups — temperature changes due to natu- 
ral forcings; changes due to anthropogenic 
greenhouse gases; changes due to other 
anthropogenic factors (aerosols other than 
black carbon, ozone, and land-use alteration); 
and changes due to black carbon. The ‘detec- 
tion and attribution’ method to identify a 
temperature signal due to a specific forcing was 
then applied to try to discern the correspond- 
ing simulated patterns of temperature change 
in the observational record. 

Focusing on the second half of the twentieth 
century (1950-99), Jones et al. find that natural 
forcing did not contribute significantly to the 
temperature trends, whereas all three groups 
of anthropogenic forcings did contribute sig- 
nificantly. This was previously known for the 
warming by greenhouse gases and the relative 
cooling by aerosols. But Jones et al. for the first 
time show that black carbon has warmed Earth 
by a statistically significant amount. 

Their study is particularly timely given the
recent revival of discussion about whether 
control of black-carbon emissions might help 
to mitigate global warming. The scientific 
debate on this topic focuses on the net effect 
of a reduction in anthropogenic black carbon 
because, in addition to the warming caused 
by absorption of sunlight, black carbon may 
affect climate by other means. When mixed 
with other aerosols, black-carbon aerosols may 
serve as cloud-condensation nuclei, altering 
cloud reflectivity and probably having a cool- 
ing influence. By contrast, when deposited 
on snow, black carbon may act as a warming 
agent. The amount of sunlight reflected is 
reduced if a snow surface polluted by soot is 
less bright, an effect that may become stronger 
if the snow is heated so much that it melts. 

These two contrasting effects of black car- 
bon were not included in Jones and colleagues’ 
model. However, if either the cooling effect by 
enhanced cloud reflectivity was large enough to 
compensate for the warming due to the absorp- 
tion of sunlight plus potential dissolution of 
clouds, or the warming by pollution of snow 
was greater than the other effects, the simulated 
pattern of temperature change probably would 
not have been detectable in the observations. 
Thus, this study is also a notable contribution 
to the debate about which of black carbon’s 
impacts on climate is most important. 

In quantitative terms, considerable uncer- 
tainty remains. In Jones and colleagues’ study, 
this is most manifest in the fact that a substantial 
warming of 0.21 kelvin due to black carbon was 
detected for 1950-99, whereas it is only 0.14 kelvin 
for a slightly shifted period, 1957-2006. The 
likely reasons are that this second period was 
more strongly affected by the 1991 eruption of 
Mount Pinatubo, and that the relative impor- 
tance of black carbon diminished after the late 
1990s in many regions. It is also noteworthy 
that other anthropogenic climate forcings 
had a much stronger impact on the global 


temperature — Jones et al. find a three to four 
times greater warming by greenhouse gases. 
In the context of mitigation of climate change, 
a further consideration is that black carbon is 
usually co-emitted with other aerosols, such as 
sulphate, that cool the climate. Reductions in 
total aerosol output might be desirable for pub- 
lic-health purposes. In the context of climate 
change, however, appropriate technical means 
would have to be applied to reduce the warming 
influence of black carbon but not the (probably 
larger) cooling effect of other aerosols. 
Finally, it is also necessary to remember 
that anthropogenic aerosols, including black 
carbon, have a very short atmospheric lifetime 
compared with that of greenhouse gases. The 
gases typically have lifetimes of centuries and 
longer’, compared with days for aerosols. This 
implies that, if implementation of emission- 
reduction strategies were indeed to be feasible, 
climate-change mitigation by cutting black- 
carbon emissions could be effective fast. But 
it also suggests that the relative importance of 
black carbon will in any case gradually dimin- 
ish, given that greenhouse gases are long-lived 
and that they will continue to accumulate 


A mouse is not a cow 


Early cell-lineage decisions during embryonic development differ between mice 
and cows. This finding calls for a re-examination of developmental variations 
across mammals, but does not undermine use of the mouse as a model organism. 


JANET ROSSANT 


The mammalian blastocyst is a thing of 
beauty. Over a period of a few days after 
the union of an egg with sperm, the fertilized 
egg divides to generate this tiny hollow 
sphere of cells, which has a cluster of enclosed 
cells at one end of the fluid-filled cavity. The 
outer cells are called the trophectoderm and 
the inner cells are, inventively, named the 
inner cell mass. But when do cells commit to 
becoming one or the other, and how? Writing 
in Developmental Cell, Berg et al.' 
show that the answers to these 


further down the trophectoderm lineage’. This 
suggests a model — albeit an overly simplistic 
one — whereby restricted expression of Oct4 
and Cdx2 leads to reciprocal repression of the 
opposing lineage and establishes cell fate. 
Berg et al.' asked whether this model applies 
to cell-fate decisions in cows. They find that, 
unlike in mice, Oct4 expression is not restricted 
only to the ICM during the early stages of 
cow blastocyst development. Instead, Oct4 is 
co-expressed with Cdx2 in the trophectoderm 
for some time after the beginning of blastocyst 
in the atmosphere as long as anthropogenic 
emissions of these gases continue. 
formation. This observation is consistent with 
previous reports and has also been made for pig 
and human embryos (for example, see refs 4, 5). 
Even in the mouse, Oct4 expression overlaps 
with Cdx2 expression during the late cleavage 
and early blastocyst stages of embryonic devel- 
opment, and is restricted to the ICM only by the 
fully expanded blastocyst stage’. 

So why is Oct4 expression maintained for 
longer in the cow trophectoderm than in its 
mouse equivalent? Through experiments 
involving cow blastocysts engineered to express 
a fluorescently tagged version of mouse Oct4 
(the mouse Oct4-GFP transgene), Berg and 
co-workers show that the factors that restrict 
Oct4 expression to the ICM are not avail- 
able, or not functional, in the cow blastocyst 
(Fig. 1a). Cdx2 could be one such factor, but 
the authors’ data suggest that this protein has 
a role only later during cow embryonic devel- 
opment. However, Berg and colleagues do not 
investigate whether the role of Cdx2 in restrict- 
ing Oct4 expression is simply delayed in the 

cow embryo, nor whether Oct4 is 
ectopically expressed later during 


development in embryos treated to express reduced levels of Cdx2. 
The paper¹ shows that a mouse Oct4-GFP transgene containing the 
bovine Oct4 regulatory elements is expressed in both the ICM and 
trophectoderm in fully expanded 

questions are not the same for mice and cattle. 
Pluripotency, a cell’s ability to differentiate into all cell types 
of the body, is a common property of the inner cell mass (ICM) 
of all mammalian blastocysts 


and is always associated with the 
expression and function of the 
transcription factor Oct4. The 
trophectoderm, which later gener- 
ates all of the specialized layers of 
the placenta, also expresses a num- 
ber of lineage-restricted transcrip- 
tion factors, most notably Cdx2. 
In mice, deletion of either the 
Oct4 gene (also known as Pou5f1) 
or the Cdx2 gene leads to the for- 
mation of abnormal blastocysts: 
ICM cells of Oct4-mutant blas- 
tocysts express trophectoderm 
markers and lose pluripotency’, 
whereas the outer cells of Cdx2- 
mutant blastocysts express pluri- 
potency markers such as Oct4 
ectopically and fail to differentiate 




Figure 1 | Oct4 regulation in mouse and cow blastocysts. a, Berg et al.’ 
find that the expression of a GFP fluorescent reporter transgene controlled 
by the regulatory elements of the mouse Oct4 gene (blue) is restricted to the 
inner cell mass (ICM) in mouse blastocysts but not cow blastocysts. b, The 
same transgene, but containing the bovine Oct4 regulatory elements (red), is 
not restricted to the ICM in either cow or mouse blastocysts. c, The authors 
narrow down this effect of bovine regulatory elements to the CR4 region. 


blastocysts of both the cow and 
the mouse (Fig. 1b). This suggests 
that Cdx2, which is active in mouse 
blastocysts, is not the only factor that 
affects the timing of Oct4 repression. 
It also indicates that bovine regula- 
tory elements do not respond to the 
factors that downregulate Oct4 in 
mouse blastocysts. 

Of the four evolutionarily con- 
served regulatory regions around 
the Oct4 locus, CR4 shows the most 
sequence divergence between the 
mouse and the cow. When Berg 
et al. replaced mouse CR4 with the 
cow version in the mouse Oct4- 
GFP construct, it behaved like the 
cow gene in the mouse blastocysts 
(Fig. 1c). Thus changes in both DNA 
regulatory regions and the factors that bind to 
such sequences drive differences in the regula- 
tion of Oct4 expression between mouse and 
cow blastocysts. 

It would be interesting to test, in transgenic 
mice, whether regulatory elements of the 
human OCT4 gene behave like the mouse or 
the cow sequences. Although human blasto- 
cysts, like those of domestic animals, express 
Oct4 in the trophectoderm for an extended 
period compared with mice, the period of 
overlap of Cdx2 and Oct4 expression is only 
slightly longer than in the mouse. Human 
OCT4 is clearly restricted to the ICM by day 6 
before embryo implantation’®. 

But why do these regulatory differences exist 
among the blastocysts of different mammals? 
Evolutionarily, the placenta is a recent inven- 
tion, and still seems to be a work in progress. 
There is huge variation in trophectoderm 
and placental morphology across different 
mammalian species, accompanied by recent 
evolutionary divergence in placenta-specific 
gene families’. For example, a mouse blastocyst 
attaches and implants in the uterus by embry- 
onic day 5 (E5); a human blastocyst grows a 
little larger but then implants by E7-9 with 
highly invasive trophoblast outgrowth; and in 
cows, pigs and sheep the blastocyst floats in the 
uterus for 2-3 weeks before attaching. 

Berg et al. propose that such differences lead 
to earlier restriction of trophectoderm cell fate 
in the mouse than in the cow. Indeed, results of 
their experiments — involving chimaeric blas- 
tocysts generated by mixing trophectoderm 
cells from different stages of development with 
host embryos — support this proposal. 

In a remarkable technical tour de force, they 
also transferred the chimaeric cow blastocysts 
to recipient cows and recovered them later in 
development to show that early trophectoderm 
cells can contribute to developing ICM deriva- 
tives. This is one of the first attempts to test the 
timing of lineage restriction in a species other 
than the mouse. 

This study emphasizes the need to explore 
the timing and mechanism of functional 
lineage restriction in blastocysts of different 
mammals, including humans. Differences in 
these parameters may underlie the known 
difficulty in deriving validated pluripotent 
embryonic stem cells and trophoblast stem 
cells from many mammalian species. Although 
fibroblasts have been reprogrammed into 
induced pluripotent stem cells in several 
domestic species, including the cow, these 
lines often depend on continued expression 
of exogenous reprogramming factors. Clearly, 
we need a better understanding of the control 
of pluripotency in all these species. 

As we learn more about the precise details 
of mouse blastocyst development, we must 
constantly evaluate the similarities and differences 
between mouse blastocysts and those of humans and 
other species. This will help us to truly understand 
mammalian embryo diversity. 
A fly in the face of genomics 


The modENCODE project uses integrative analysis to annotate genomic 
elements in the fruitfly and a nematode worm. The first fly data have now 
been published. SEE ARTICLES P.473 & P.480 & LETTER P.527 


EILEEN E. M. FURLONG 


The fruitfly Drosophila melanogaster is 
an exceptional model for dissecting 
the basic principles of biology, development 
and disease. It is amenable to genetic 
manipulation using tools developed over more 
than a century; and its genome shares exten- 
sive genetic content with humans. The first 
draft of the Drosophila genome was released 
a decade ago’, and with subsequent updates 
its annotation is in a ‘mature’ state. Neverthe- 
less, more than half of the predicted genes 
have been awaiting experimental verification 
of their structure — the location of promoter 
sequences, of boundaries of protein-coding 
and non-coding sequences, and of transcrip- 
tion termini. The modENCODE consortium 
project aims to address this issue and to iden- 
tify new genes and genomic elements in the 
fly genome’. Here I focus on the first wave of 
papers, including three in this issue*°, which 
describe the fly data so far. 

To determine which genes are expressed at 
specific stages of development, Graveley et al.’ 
(page 473) generated high-resolution expres- 
sion data, which are complemented by an 
analysis of 25 Drosophila cell lines®’. These 
efforts identified almost 2,000 new genes that 
encode proteins or non-coding RNAs. They 
also extensively refine existing annotation by 
describing more than 3,000 new promoter 
sequences’, roughly 53,000 new or revised 
exon sequences’, a threefold increase in RNA- 
splicing events’ and a tenfold increase in 
RNA-editing events’. Notably, most of the RNA- 
editing and -splicing events occur at precise 
stages of the Drosophila life cycle, indicating 
extensive temporal regulation of these post- 
transcriptional events by as-yet poorly under- 
stood mechanisms. This comprehensive view 
of the fly transcriptome*®” reveals that some 
75% of the organism's genome is transcribed at 
one stage or another — in line with the wide- 
spread transcription observed in other species. 

Post-translational histone modifications 
covering a gene's promoter or coding region 
provide telltale signatures of the expression 
status of a gene and thereby present another way 
to identify functional elements in the genome. 
Two of the modENCODE studies involved 
mapping such chromatin marks in Drosophila 
cell lines* and at 11 stages of its life cycle’. 

By examining the distribution of 18 histone 
modifications in two cell lines, Kharchenko 
et al.* (page 480) identified nine prominent 
chromatin signatures, which complement those 
defined previously’. Clues to their function 
come from information on chromatin acces- 
sibility and transcriptional activity, revealing 
chromatin signatures that distinguish between 
active and inactive genes, active promoters, 
and the location of new putative regulatory 
elements. The authors’ global analyses* extend 
previous studies”” indicating that the Poly- 
comb system — a group of chromatin-binding 
proteins traditionally associated with stable, 
long-term gene repression during embryonic 
development — can also function dynamically 
and associate with promoters that are actively 
transcribed or seem poised for activation. 

Deposition of chromatin marks is linked 
to the enzymatic activity of RNA polymerase 
during the initiation and elongation steps 
of transcription; this activity is regulated by 
transcription factors bound to cis-regulatory 
elements — proximal and distal sequences 
that affect gene expression. To understand 
how transcription is regulated, Nègre et al.” 
(page 527) made a systematic effort to iden- 
tify all cis-regulatory elements by examining 
the occupancy of 38 transcription factors and 
other chromatin-regulatory proteins at dif- 
ferent stages of development. The result is a 
collection of around 20,000 putative regulatory 
elements that include insulators, enhancers and 


COSMOLOGY 


Lenses under the lens 


If you were to pop into a cosmology 
conference today, the chances are that 
you would see this image in at least one 
presentation. It is a striking snapshot of a 
cluster of galaxies acting as a gravitational 
lens: the cluster bends light from galaxies 
lying behind it and ‘smears’ the light to 
produce multiple images and giant arcs. 
As pretty as their effects are, gravitational 
lenses are giving cosmologists a few 
headaches. For example, the observed 
incidence of giant arcs and their distance 
from the clusters’ centres, which marks 
the size of features called Einstein rings, 
indicate that these clusters may have a 


promoters. Of the more than 2,000 putative 
promoters, 50% are already confirmed". The 
locations of about 14,500 putative cis-regula- 
tory elements were also identified. Unexpect- 
edly, one class of active promoters does not 
contain the characteristic chromatin mark 
H3K4me3, suggesting that the genes they regu- 
late use an alternative mode of transcriptional 
initiation. 

Integrating the binding patterns of all 
transcription factors leads to hypotheses of 
transcription-factor partnerships, involving 
co-binding to regulatory elements”. But over- 
lays of transcription-factor binding should be 
interpreted cautiously, particularly for factors 
with non-tissue-specific or partially overlap- 
ping expression: regions that are co-targeted by 
multiple factors are not necessarily co-bound 
in the same cells. Nevertheless, the complexity 
of some co-targeted regions is intriguing. The 
modENCODE researchers identified regions 
in the genomes of both Drosophila*’ and the 
nematode Caenorhabditis elegans'* — the other 
model organism on which the project focuses 
— that are highly occupied by transcription 
factors. It remains to be determined what func- 
tion, if any, such regions have in transcription. 

This first phase of modENCODE has made 
a significant impact on refining the annota- 
tion of the Drosophila genome, which forms 
the foundation of a large body of research 
conducted in this organism. But where should 
the project go from here? First, there is the 
issue of completion. With the new data, the 
annotation of genes may be 80% complete, 
but the job is far from over. Despite the huge 
depth of coverage, almost 1,500 known genes 
could not be identified in any experiments’. 
Analysis of specific subpopulations of cells 
and tighter staging of the developmental 
process should greatly improve sensitivity. 

Completing annotation of the ‘regulatory 
genome’ is much more challenging. Although 


stronger ‘lensing’ ability than expected in the 
framework of the currently accepted model of 
the cosmos. In a paper to appear in Astronomy 
& Astrophysics, Meneghetti et al. describe an 
analysis that advances our understanding of 
these systems (M. Meneghetti et al. Preprint at 
http://arXiv.org/abs/1103.0044; 2011). 

The authors compared the lensing ability 
of a numerically simulated sample of clusters 
with that of a sample of well-characterized, 
X-ray-luminous clusters obtained by 
the MAssive Cluster Survey (MACS). In 
contrast to earlier studies, their simulations 
factor in elements known to affect lensing 
power — for example, the fact that the 


the location of putative enhancer elements 
can be identified, determining which of these 
regions are functional, and when, is a huge 
task. Understanding the regulation of enhancer 
activity requires knowledge of which transcrip- 
tion factors are binding to them, in which cell 
types, and when. Scaling this up to the roughly 
700 predicted Drosophila transcription fac- 
tors is a monumental undertaking, but feasible 
given current tagging technologies’””*. 

A major drawback of the data sets is their 
lack of temporal and spatial resolution. 
Although cells in culture are extremely use- 
ful for identifying core properties of basic 
cellular processes, such immortalized cells, 
devoid of their developmental context, can- 
not substitute for cells within a developing 
embryo. On the other hand, whole-embryo 
studies provide merged signals from all cells 
in the embryo, giving no information on the 
tissue in which a gene, promoter or chromatin 
state is active. Many of the transcription factors 
examined are expressed across a broad range 
of tissues, which has the advantage of cover- 
ing a wide range of cis-regulatory elements. But 
merged transcription-factor occupancy signals 
from multiple tissues make it very difficult to 
disentangle regulatory connections and thus to 
build reliable regulatory networks. 

The general absence of functional informa- 
tion is perhaps the most serious limitation of 
the current work and a major challenge for all 
genomics projects. Such information is essen- 
tial to understand the relevance of regulatory 
connections. Examining mutants was under- 
standably beyond the scope of the present 
studies, but, moving forward, there is a clear 
need to integrate diverse types of functional 
data in order to make the transition from 
correlations to regulatory function. The 
thousands of Drosophila mutants available 
should provide a useful resource for this. 

We can view this work*~ as an important 
lenses are complex three-dimensional 
structures. They found that the simulated 
clusters produce 50% fewer arcs than do 
the observed MACS clusters, and that the 
median size of Einstein rings differs by 25% 
between the two samples. These are much 
smaller discrepancies between theory and 
observation than previously reported. But as 
the authors themselves concede, more data 
are needed to confirm their findings. Ana Lopes 


chapter in a long book. The data — all freely 
available’ — provide an excellent resource for 
identifying putative genes and regulatory ele- 
ments that might be active at a particular stage 
of development. The sheer volume of new 
transcripts and putative regulatory elements, 
and the inherent complexity of their interac- 
tions, demonstrates how far the project has 
come, but also highlights the challenges that 
lie ahead to convert this wealth of informa- 
tion into regulatory networks that describe 
the transformation of a fertilized egg into a 
complex multicellular organism. To reach this 
goal, researchers must integrate new types of 
experiments that will address the function of, 
and connections between, genomic regions at 
high spatio-temporal resolution. With this in 
mind, we can envisage a next phase of exciting 
studies that will tackle these issues, and so look 
forward to seeing what comes next. 
Catalytic Z-selective olefin cross-metathesis for natural product synthesis 


Simon J. Meek, Robert V. O’Brien¹, Josep Llaveria¹, Richard R. Schrock² & Amir H. Hoveyda¹ 


Alkenes are found in many biologically active molecules, and there are a large number of chemical transformations in 
which alkenes act as the reactants or products (or both) of the reaction. Many alkenes exist as either the E or the 
higher-energy Z stereoisomer. Catalytic procedures for the stereoselective formation of alkenes are valuable, yet 
methods enabling the synthesis of 1,2-disubstituted Z alkenes are scarce. Here we report catalytic Z-selective 
cross-metathesis reactions of terminal enol ethers, which have not been reported previously, and of allylic amides, 
used until now only in E-selective processes. The corresponding disubstituted alkenes are formed in up to >98% Z 
selectivity and 97% yield. These transformations, promoted by catalysts that contain the highly abundant and 
inexpensive metal molybdenum, are amenable to gram-scale operations. Use of reduced pressure is introduced as a 
simple and effective strategy for achieving high stereoselectivity. The utility of this method is demonstrated by its use in 
syntheses of an anti-oxidant plasmalogen phospholipid, found in electrically active tissues and implicated in 
Alzheimer’s disease, and the potent immunostimulant KRN7000. 


Carbon-carbon double bonds reside within a large variety of molecules 
that possess desirable properties’, and catalytic cross-metathesis* (CM; 
Fig. 1) represents one of the most attractive approaches to stereoselec- 
tive preparation of these versatile functional groups. Through fusion of 
two terminal alkenes, available in ample quantities as by-products of 
petroleum purification or readily accessed by a variety of methods, 1,2- 
disubstituted alkenes can be obtained; the other product generated is 
gaseous ethylene. However, the only reported instances of Z-selective 
CM (65-90% Z) involve substrates with an sp-hybridized substituent 
(acrylonitrile or enynes**). In an efficient Z-selective CM, it is not only 
required that reaction between the two substrates proceed selectively 
(versus homocoupling), it must exhibit a preference for the thermo- 
dynamically less favoured stereoisomer (Fig. 1). The inherent reversi- 
bility of olefin metathesis (products can re-enter the catalytic cycle) 
and the higher reactivity of Z alkenes (versus E isomers) further exacer- 
bate the problem. Through careful consideration of various mech- 
anistic aspects of the process, conditions must be identified where 
the catalyst promotes CM but fails to react with the product Z alkene 
to effect equilibration, favouring the lower energy E isomer. 
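
To make the thermodynamic penalty concrete, the equilibrium Z:E ratio that would result from full equilibration can be estimated from the free-energy gap between the two isomers using the Boltzmann relation [Z]/[E] = exp(-ΔG/RT). The sketch below is an illustration only: the function name is hypothetical, and the 1 kcal/mol gap used in the usage note is an assumed, order-of-magnitude value rather than a figure taken from this article.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal per (mol K)


def equilibrium_z_fraction(delta_g_kcal, temp_k=295.15):
    """Fraction of Z isomer at equilibrium.

    delta_g_kcal -- free energy of the Z isomer minus that of the E isomer
                    (positive when Z is less stable), in kcal/mol
    temp_k       -- absolute temperature; 295.15 K corresponds to 22 degrees C
    """
    ratio = math.exp(-delta_g_kcal / (R_KCAL * temp_k))  # [Z]/[E]
    return ratio / (1.0 + ratio)
```

For an assumed gap of 1 kcal/mol at 22 °C, `equilibrium_z_fraction(1.0)` gives a Z fraction of roughly 15%, far from the >98% Z selectivity targeted here. This is why the catalyst must deliver the product under kinetic control and must not re-engage the Z alkene.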


Challenges of catalytic Z-selective cross-metathesis 


As the preliminary steps towards the eventual development of an effi- 
cient class of Z-selective CM reactions, we investigated two related— 
but much simpler—versions of the process. Alkylidenes and carbenes 
1-5 (Fig. 1) represent the catalyst classes used in our studies. 
Stereogenic-at-Mo 1a, 1b and 2°° were recently designed in these 
laboratories to promote enantioselective ring-closing metathesis; 
theoretical’ and experimental explorations® suggest that these com- 
plexes exhibit high activity partly as the result of stereoelectronic effects 
induced by the electron donor pyrrolide and acceptor monoaryloxide 
ligands. The fluxional nature of complexes such as 1a, 1b and 2, 
facilitated by the absence of rigid bidentate ligands, allows the metal 
alkylidenes to adapt to the structural strains imposed during the 


catalytic cycle. As a result, the stereogenic-at-Mo complexes are 
generally more effective olefin metathesis catalysts than other Mo- 
based complexes 3’ and 4’° or Ru carbene 5''. We thus established 
that alkylidene 2 readily catalyses Z-selective alkene formation 
through ring-opening/cross-metathesis (ROCM)” with strained 
oxabicyclic alkenes and styrenes. Homocoupling of terminal alkenes 
was subsequently shown to proceed with high efficiency and Z selec- 
tivity in the presence of members of the same catalyst class'*. The 
general mechanistic features that engender Z selectivity in the above 
reactions, and would be expected to do so in a CM process, are 
depicted in Fig. 1. The preference for Z alkene formation can be 
attributed to the ability of the large monodentate aryloxide to rotate 
freely (compare I in Fig. 1), causing the incoming alkene to be 
oriented such that its substituent (R2) is situated syn to that of the 
alkylidene (R1). 

Designing a Z-selective CM is substantially more difficult: in a homo- 
coupling, only one type of alkene is involved and no more than two 
stereoisomeric alkenes can be formed; in contrast, there are two sub- 
strates in CM, which can generate up to six different products. In the 
case of a catalytic ROCM™, a strained cyclic alkene and a terminal 
alkene that are reluctant to undergo homocoupling (for example, a 
styrene) must be selected as substrates for the catalytic process to be 
efficient’. Transformations are carefully crafted such that the alkylidene 
derived from the terminal alkene favours association with the cyclic 
alkene (versus another of the same type) in the ring-opening stage, 
generating a new Mo complex that strongly prefers to react with a 
sterically less demanding terminal alkene (CM stage). The possibility 
of a transformation between the alkylidene generated through ring- 
opening and another strained—but more hindered—cyclic alkene is 
thus discouraged (that is, minimal homocoupling or oligomerization). 
Such deliberate orchestration is not feasible with catalytic CM, where 
both alkenes are mono-substituted and, in contrast to ROCM, there is 
no relief of ring strain to be manipulated. 
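
The product count quoted above (two alkenes for a homocoupling, up to six for a CM) follows from simple enumeration: each unordered pairing of substituents gives one 1,2-disubstituted alkene, which can exist as either the Z or the E isomer. A minimal sketch, with the hypothetical labels G1 and G2 standing for the substituents of the two terminal alkenes and the ethylene by-product left out:

```python
from itertools import combinations_with_replacement


def cm_products(substituents=("G1", "G2")):
    """List the disubstituted alkenes that metathesis can form.

    Each entry is (substituent_a, substituent_b, geometry). Pairs with
    identical substituents are homocoupling products; mixed pairs are
    the desired cross products.
    """
    pairs = combinations_with_replacement(substituents, 2)
    return [(a, b, geometry) for a, b in pairs for geometry in ("Z", "E")]
```

Calling `cm_products()` returns six entries, of which only the two with different substituents are cross products and only one of those is the desired Z isomer; calling it with a single substituent, `cm_products(("G1",))`, returns the two stereoisomers possible in a homocoupling.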
Massachusetts 02139, USA. 


24 MARCH 2011 | VOL 471 | NATURE | 461 


©2011 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


[Figure 1 graphic: Z-selective cross-metathesis of terminal alkenes. Panels show the cross-metathesis and homocoupling products (Z alkenes are higher in energy than the corresponding E alkenes, so catalyst-induced kinetic control, rather than substrate-induced thermodynamic control, is required for stereoselective Z-alkene synthesis); complexes 1a (R = Me), 1b (R = i-Pr), 2, 3, 4 and 5; and rotation around the Mo–O bond, by which the rotating, large monodentate aryloxide ligand generates a significant steric presence.] 

Figure 1 | A catalytic CM reaction can afford as many as six alkenes, so the 
challenge is designing an efficient process that favours formation of the 
cross products. Particularly difficult is the development of a process that 
affords the higher-energy Z alkene predominantly. To accomplish a Z-selective 
CM, a variety of catalysts were considered, such as stereogenic-at-Mo 
complexes (1, 2) or other previously reported Mo- and Ru-based complexes 


Z-Selective cross-metathesis of enol ethers 

We began by evaluating the ability of stereogenic-at-Mo complexes to 
promote transformations of enol ethers, a class of substrates for which 
a CM reaction has not been previously reported (E- or Z-selective); the 
resulting products have proven to be of utility in chemical synthesis 
and can be found in biologically active molecules (see below). In the 
presence of 2.5 mol% 1a, CM between 6 and 7 (entry 1, Table 1) 
proceeds to 85% conversion to afford disubstituted enol ether 8a in 
98% Z selectivity and 73% yield. With 1b, which bears a more sizeable 
2,6-di-i-propyl-arylimido unit, the reaction is completely Z-selective 
(>98% Z) but 47% conversion is achieved within the same time span. 
When alkylidene 2 is used, CM proceeds to 37% conversion and 


Table 1 | Examination of various catalysts for CM with an enol ether 


[Scheme: 6 + 7 → 8a; 2.5 mol% Mo or Ru complex, C6H6, 22 °C] 

Entry no.   Complex   Time     Conv. (%)*   Yield (%)†   Z:E* 
1           1a        2 h      85           73           98:2 
2           1b        2 h      47           ND           >98:2 
3           2         2 h      37           ND           >98:2 
4           3         2 h      <2           NA           NA 
5           4         10 min   80           ND           47.5:52.5 
6           5         24 h     <2           NA           NA 


The reactions were carried out in purified benzene under an atmosphere of nitrogen gas; 10 equiv. of 6 
was used (see Supplementary Information for details). NA, not available; ND, not determined. 

* Conversion (conv.) and Z:E ratios were measured by analysis of 400 MHz 1H NMR spectra of 
unpurified mixtures; the variance of values is estimated to be <±2%. 

† Yield of isolated product after purification; the variance of values is estimated to be <±5%. 
(3-5). The structural flexibility of the stereogenic-at-metal complexes 1 and 
2 can give rise to exceptional reactivity, and free rotation around the Mo-O 
bond of these alkylidenes might serve as the basis for development of highly 
Z-selective olefin metathesis reactions of terminal alkenes. The sphere 
represents an appropriate size imido substituent. 


>98% Z-8a is generated; further transformation is not observed after 
six hours. Mo-based diolate 3 and Ru carbene 5 do not promote CM, 
and achiral Mo complex 4 catalyses a non-selective transformation 
(47.5% Z). Thus, stereogenic-at-Mo complexes prove to be effective in 
promoting enol ether CM, and although 1b or the less hindered 2 also 
afford exceptional stereoselectivity, neither delivers the efficiency of 
1a. The 2,6-dimethylphenylimido 1a therefore offers the best balance 
between activity and stereoselectivity. Such performance variations 
may be observed because catalyst turnover is slower with the more 
sizeable 1b whereas the methylidene of the relatively unhindered 2 
(compare IV, Fig. 1) might suffer from a shorter life span. Consistent 
with the above scheme, 82% 8a is formed when CM with 1b is allowed 
to continue for 16 hours; in contrast, conversion with 2 after 10 minutes 
or two hours is nearly identical (~38%). 

There are several, mechanistically revealing, reasons for use of excess 
enol ether. CM generates a Mo-methylidene; this unhindered alkyli- 
dene can readily react with the Z-alkene product, reverse CM, cause 
equilibration and lower stereoselectivity. An enol ether reacts with a 
methylidene complex, circumventing diminution in Z selectivity. The 
more stable alkoxy-substituted alkylidene, generated from reaction of a 
methylidene complex and an enol ether (I in Fig. 1 with R1 = On-Bu), 
can undergo productive CM, giving rise to longer catalyst lifetime and 
improved turnover numbers. Furthermore, generation of the afore- 
mentioned alkoxy- or aryloxy-containing alkylidene means less of 
the alkyl-substituted derivative is formed and homocoupling of the 
aliphatic alkene is minimized. Owing to electronic factors, productive 
reaction between an enol ether-derived alkylidene and another 
O-substituted alkene is disfavored’. However, as use of excess enol 
ether is wasteful, we decided to examine the efficiency of the CM with 
varying amounts of 6 (see Supplementary Information for details). 
The latter studies established that, although fewer equivalents of 6 
lead to reduced Z selectivity and competitive homocoupling, with 
5 equiv. of the inexpensive and commercially available enol ether, 
8a can be obtained in 93:7 Z:E selectivity and 71% yield (7% homo- 
coupled product). Excess enol ether 6 does not complicate product 
isolation, as this inexpensive reagent is volatile and can be easily 
removed in vacuo. 

Z-Disubstituted enol ethers are obtained in 57-77% yield through 
exceptionally stereoselective (94% to >98% Z) CM with Mo alkylidene 
1a (Fig. 2). Alkyl- (8) or aryl-substituted (10) Z enol ethers as well as 
those that bear a carboxylic ester (8c), a secondary amine (8e), a brom- 
ide (10b) or an alkyne (10c) are readily accessed. Reactions with the 
more electron-deficient enol ether 9 and the relatively electron-rich 
alkenes proceed with 2.0 equiv. of the aryl-substituted enol ether; in 
contrast, 10 equiv. of alkyl-substituted and easily removable 6 are 
required for similar efficiency. Such variations probably occur because 
when 9 is used there is a better electronic match’ between the Mo- 
alkylidenes derived from the cross partners and either of the two 
alkenes, favouring CM versus homocoupling. Only 1.2 mol% 1a and 
2.0 equiv. of the p-methoxyphenylenol ether (for example, 10a, 10b and 
10d, Fig. 3c) are sufficient for an effective and exceptionally Z-selective 
CM to take place. 
Synthesis of natural product C18 (plasm)-16:0 (PC) 


Next, we set out to demonstrate the utility of the catalytic CM process by 
a diastereo- and enantioselective synthesis of an anti-oxidant plasmalo- 
gen phospholipid, C18 (plasm)-16:0 (PC) (Fig. 2)'*’”, the correspond- 
ing E isomer of which has been shown to be less active'’. This initiative 
required addressing a challenge that is of general concern in catalytic 
CM: the inefficiency associated with the use of excess of one cross 
partner. The enol ether to be used (11) in the CM step is more valuable 
than the commercially available and inexpensive 1-octadecene (12), 
rendering utilization of excess amounts of the former unfavourable. 
Reducing the enol ether concentration diminishes efficiency and 
Z-selectivity, as detailed above and substantiated by the data in 
Table 2 (85% and 47% conversion with 5:1 and 1:1 11:12; entries 1 
versus 2). Larger quantities of the less valuable 12 could improve yield 
and selectivity, as Mo-methylidene concentration is probably lowered 
through its reaction with excess alkene. However, increased amounts of 
an aliphatic alkene, unlike an enol ether, give rise to homocoupling and 
ethylene generation. Ethylene, in addition to being detrimental to the 
rate of CM (because it competes with the substrates for reaction with the 
available alkylidene), causes diminished stereoselectivity by increasing 
methylidene concentration, which promotes Z alkene isomeriza- 
tion (see above). We thus surmised that, if the negative effects of the 
generated ethylene were to be attenuated by performing the reaction 

Figure 2 | Z-selective CM reactions of enol ethers with terminal alkenes and 
application to stereoselective synthesis of C18 (plasm)-16:0 (PC). Various Z 
enol ethers are synthesized with 1.2-5.0 mol% of Mo complex 1a, and typically 
require 2.0 equiv. (in the case of p-methoxyphenylvinyl ether) or 10.0 equiv. 
(with butylvinyl ether) of the terminal enol ether; excess butyl vinyl ether (6) is 
easily removed in vacuo. The desired Z alkenes are obtained in 51-77% yield 
and in 94% to >98% Z selectivity. Application to synthesis of C18 (plasm)-16:0 
(PC) demonstrates the utility of the Z-selective Mo-catalysed CM, which is used 
in conjunction with a site- and enantioselective Cu-catalysed dihydroboration 
of the terminal alkyne in 14 (see Supplementary Information for details). All 
reactions shown were performed under N2 atmosphere; catalysts were prepared 
and used in situ. Conversions and Z selectivities were determined by analysis of 
400 MHz 1H NMR spectra of unpurified mixtures; the variance of selectivity 
values is estimated to be <±2%. Yields of isolated products are shown (±5%). 
Reactions: for 8b-8e, we used 2.5 mol% 1a and 10 equiv. 6; for 10a, 10b, we 
used 1.2 mol% 1a and 2.0 equiv. 9; for 10c, 10d, we used 5.0 mol% 1a and 10 
equiv. (10c) or 2.0 equiv. 9 (10d). See Supplementary Information for 
experimental details. Conditions for synthesis of 16. Route a, step 1: 2.5 mol% 
1a, C6H6, 22 °C, 2.0 h, decalin, 1.0 torr; step 2: 5.0 equiv. (n-Bu)4NF, THF, 
22 °C, 2 h. Route b: 2.5 mol% 15, 2.5 mol% CuCl, 20 mol% NaOt-Bu, 2.1 equiv. 
bis(pinacolato)diboron, 3.0 equiv. MeOH, THF, 0 °C, 24 h; 30% H2O2, NaOH 
in aqueous THF, 1.0 h. 
Figure 3 | Z-selective CM reactions of allylic amides with terminal alkenes 
and application to stereoselective synthesis of KRN7000. A range of Z-1,2- 
disubstituted allylic amides can be synthesized; in most cases, use of reduced 
pressure leads to substantially improved yield and stereoselectivity. Application 
to the stereoselective synthesis of KRN7000, involving catalytic 
diastereoselective dihydroxylation of the Z alkene obtained by Mo-catalysed 
CM, leads to an expeditious route for preparation of this biologically significant 
molecule (see Supplementary Information for details). All reactions shown 
were performed under N2 atmosphere with 3.0 mol% 2, 3.0 equiv. of the non- 


under vacuum, an efficient CM might be induced to proceed with only a 
relatively slight excess of the aliphatic alkene (12). Indeed, when cata- 
lytic CM is performed with an equal amount of 11 and 12 under 1.0 torr 
(entry 3, Table 2), efficiency (78% versus 47% conversion in entry 2) as 
well as stereoselectivity is substantially improved (97% versus 91.5% Z). 
With reduced pressure, 2 equiv. of 12 (versus 11) and decalin as solvent 
(to prevent precipitation of the homocoupled by-product causing cata- 
lyst sequestration), 89% conversion is observed in two hours and Z-13 is 
obtained with 97% selectivity (entry 4). 

Removal of the silyl group delivers stereoisomerically pure Z-14 in 
85% overall yield (Fig. 2); the desired product cannot be accessed through 
catalytic hydrogenation of the corresponding alkyne (see also 10c). Cu- 
catalysed site- and enantioselective dihydroboration’* furnishes 16 (in 
98:2 enantiomeric ratio, e.r.), which has been previously converted to 


Table 2 | Effect of reduced pressure on efficiency and Z selectivity 


[Reaction scheme: enol ether 11 + 1-octadecene (12), 2.5 mol% 1a → 13] 

Entry no.  11:12  Time (h)  Conv. (%)*  Solvent  Pressure  Z:E* 
1          5:1    2         85          Benzene  Ambient   >98:2 
2          1:1    2         47          Benzene  Ambient   91.5:8.5 
3          1:1    2         78          Benzene  1.0 torr  97:3 
4          1:2    2         88          Decalin  1.0 torr  97:3 


The reactions were carried out in purified benzene or decalin under an atmosphere of nitrogen gas (see 
Supplementary Information for details). 

* Conversion, Z:E ratios and the amount of the homocoupled product were measured by analysis of 
400 MHz 1H NMR spectra of unpurified mixtures; the variance of values is estimated to be <±2%. 

N-containing alkenes (19b, 19c) or 5.0 mol% 2 and 10.0 equiv. of cross partner, 
7.0 torr, 5.0 hours, 22 °C; catalysts were prepared and used in situ. Conversions, 
Z selectivities, yields and Z:E ratios determined as in Fig. 1. For 19e, reduced 
pressure was not used; reaction performed at 50 °C for 12 h. For 
19f, 19g, reaction time was one hour. See Supplementary Information for 
experimental details. Conditions for synthesis of 24: route a; 8.0 mol% 2 (in 
situ-generated), C6H6, 22 °C, 5.0 h, 1.0 torr. Route b; 5 mol% OsO4, 2.5 equiv. 
N-Me-morpholine oxide, CH2Cl2, 0 °C, 24 h. Route c; 10% trifluoroacetic acid, 
CH2Cl2, 22 °C, 30 min. Route d; 1.2 equiv. 23, Et3N, THF, 50 °C, 12 h. 


C18 (plasm)-16:0 (PC) in four steps and 86% overall yield’’. Two addi- 
tional points merit mention: (1) catalytic CM between 11 and 12 has 
been performed on the gram-scale with 1.0 mol% of in situ-generated 1a 
and 2 equiv. of 12, affording Z-13 with >98% stereoselectivity and in 
71% yield after purification (3 h, 1.0 torr, 79% conversion; see 
Supplementary Information for details). (2) The only previous synthesis 
of 16 involves nine transformations starting with (S)-isopropylidene 
glycerol (versus five reactions from (i-Pr)3Si-acetylene, Fig. 2) by a 
sequence that includes the use of highly toxic hexamethylphosphor- 
amide and a catalytic hydrogenation with lead-containing salts’”. 


Z-Selective cross-metathesis of allylic amides 

Another class of reactions that we examined involves allylic amides as 
substrates. Such catalytic CM reactions are of considerable value, as a 
large number of biologically active molecules are nitrogen-containing, 
and 1,2-disubstituted alkenes bearing a C-N bond at the allylic posi- 
tion can be functionalized in a variety of ways. Furthermore, in con- 
trast to transformations with enol ethers, CM with allylic amides 
poses the added complication that both substrates can undergo 
homocoupling. Preliminary investigations with enantiomerically 
pure allylic amide 17 (from commercially available alcohol) and 
1-hexadecene 18 indicated that the optimal catalyst for this class of 
processes is derived from adamantylimido complex 2, affording the 
desired Z alkene in 88% yield and with 97% stereoselectivity (entry 3, 
Table 3). Although arylimido derivatives 1a and 1b generate 19a with 
similar selectivity (entries 1 and 2, Table 3), reactions are inefficient 
(26-44% versus 88% conversion), perhaps because an alkylidene 
derived from 2 is less congested and can more readily promote CM 
of the relatively hindered 17. The higher efficiency of CM with 2, in 
Table 3 | Examination of various catalysts for CM with an allylic amide 

[Reaction scheme: allylic amide 17 (1.0 equiv.) + 1-hexadecene 18 (3.0 equiv.), 3.0 mol% complex, 5 h, 7.0 torr → 19a] 

Entry no.  Complex  Conv. (%)*  Yield (%)†  Z:E* 
1          1a       44          35          96:4 
2          1b       26          21          97:3 
3          2        93          88          97:3 
4          3        9           6           21:79 
5          4        71          68          12:88 
6          5        73          64          11:89 


The reactions were carried out in purified benzene under an atmosphere of nitrogen gas (see 
Supplementary Information for details). N(phth) = N-phthalamide. 

* Conversion and Z:E ratios were measured by analysis of 400 MHz 1H NMR spectra of unpurified 
mixtures; the variance of values is estimated to be <±2%. 

† Yield of isolated product after purification; the variance of values is estimated to be <±5%. 


contrast to those involving enol ethers (Fig. 2), might be the result of 
CM with 17 being performed under vacuum, allowing minimal 
amounts of the relatively unstable methylidene to be formed. Chiral 
complex 3 is ineffective, and achiral Mo alkylidene 4 and Ru carbene 5 
furnish the E isomer predominantly (79-89%). A weaker vacuum 
(7.0 torr versus 1.0 torr for CM with enol ethers) is sufficient, indicating 
that such conditions can be applied to cases that involve relatively 
volatile substrates. 

An assortment of allylic amides and terminal alkenes, including 
those that contain a halide (19b), a Lewis basic group (19c, 19d) or a 
sterically demanding substituent (19e), can be used (Fig. 3). Stereo- 
selective formation of 19f and 19g is noteworthy as the relatively less 
hindered unsaturated amides are more prone to homocoupling and 
the Z alkene products undergo equilibration to the E isomer readily, as 
manifested by the lower Z:E ratios. Although in certain cases 10 equiv. 
of a cross partner is used for maximum efficiency, lower amounts of 
alkene substrates lead to reasonably efficient processes. For example, 
with 3.0 mol% 2 and 3 equiv. of the aliphatic alkene (versus 5 mol% 
and 10 equiv.), 19g is isolated in 62% yield and 90% Z selectivity (80% 
conversion, 5 min, 22 °C). It should be noted that in all the above trans- 
formations, use of catalysts bearing a racemic binol ligand furnishes 
similar levels of reactivity and stereoselectivity (see Supplementary 
Information). 


Synthesis of immunostimulant KRN7000 

Stereoselective synthesis of the anti-tumour agent KRN7000 underlines 
the utility of the method (Fig. 3). Catalytic CM of carbohydrate- 
containing allylic amide 20, prepared in four steps from commercially 
available agents, affords 21 in 85% yield and with 96% Z-selectivity. 
Diastereoselective dihydroxylation (89% yield, 92:8 diastereomeric 
ratio (d.r.)) of the Z alkene delivers 22; it should be noted that similar 
functionalization of the corresponding E alkene isomer would afford 
an undesired diol diastereomer”. Dihydroxylamide 24 is secured in 
two steps and the target is obtained after carbohydrate deprotection’. 
Z-Selective CM thus provides access to a route that is significantly 
more concise than the 14-step sequence (compared to nine steps in 
Fig. 3) reported thus far as the shortest synthesis of KRN7000™. It is 
noteworthy that the convergent nature of a synthesis approach invol- 
ving catalytic CM, such as the two examples provided here, can easily 
translate to preparation of a variety of related analogues; for example, 
in connection with preparation of 21 (Fig. 3), a wide range of other 
terminal alkenes may be used. 




The balance between conversion and Z selectivity 

The relationship between efficiency and stereoselectivity is critical and 
merits a brief discussion. The conversion values, at times less than 
complete, represent a balance struck between achieving the highest 
yield and maximal Z selectivity with minimal substrate equivalents 
and adventitious homocoupling. Transformations performed under 
ambient conditions (no vacuum) may not proceed beyond 80% con- 
version, probably because the ethylene by-product competes with the 
remaining cross partner molecules. High ethylene concentration might 
also diminish CM rate through formation of relatively stable unsub- 
stituted metallacyclobutanes’. As mentioned before, methylidene com- 
plexes can engender reduction of Z selectivity; time-dependent studies 
indicate that stereoselectivities suffer with prolonged reaction times. 
With non-volatile substrates, if reactions are carried out under 
vacuum, complete consumption of the limiting alkene is observed only 
when excess amounts (~10 equiv.) of one cross partner are present. 
Under such regimes, however, difficulties associated with removal of 
the excess substrate and the homocoupled product might arise, ren- 
dering the use of lesser alkene amounts preferable. With lower sub- 
strate ratios, >98% consumption of the limiting substrate is difficult to 
achieve, as terminal alkene concentration is diminished as a result of 
partial homocoupling. 


Conclusions and discussions 


In addition to catalytic olefin CM, Wittig reactions”, catalytic alkyne 
hydrogenation and cross-coupling’”® are notable approaches for syn- 
thesis of Z-disubstituted alkenes. The above four types of transforma- 
tions are distinct—each delivers the desired product through a 
different bond disconnection. Similarly to CM, in cross-coupling 
alkenes serve as starting materials; in contrast to CM, however, it is 
through the synthesis of the substrate (for example, a Z vinyl halide) — 
and not in the cross-coupling step—that the stereochemical identity 
of the product is determined. Wittig-type processes are typically not 
catalytic and involve reaction of aldehydes (versus the more stable 
alkenes) and triphenylphosphonium ylides. Catalytic alkyne hydro- 
genation requires substrates derived from functionalization of a ter- 
minal alkyne; currently, relative to alkenes, methods for preparation 
of alkynes are less common and related synthesis routes are often 
lengthier. Moreover, partial hydrogenation of alkynes involves metal 
catalysts that contain poisonous lead salts and must be controlled to 
avoid over-reduction and generation of alkane by-products that can 
be difficult to separate from the desired Z alkene. Catalytic CM thus 
offers a desirable alternative to synthesis of Z alkenes, particularly as it 
requires as starting material a functional group that is stable, easily 
accessible and distinct from the other commonly used protocols men- 
tioned above. 

The strategies outlined here—including the use of reduced pressure 
to enhance stereoselectivity in catalytic CM, and the Z-selective Mo- 
catalysed transformations—offer a unique solution to a long-standing 
problem in organic chemistry”. Our findings offer additional 
evidence regarding the unique ability of stereogenic-at-Mo mono- 
aryloxypyrrolides to effect olefin metathesis reactions that extend 
beyond enantioselective processes”, with efficiency and selectivity 
levels that are not achievable with other catalyst classes. The catalytic 
processes described here are expected to affect significantly activities 
that require the stereoselective synthesis of organic molecules””®. 


METHODS SUMMARY 


General procedure for catalytic Z-selective cross-metathesis. In an N2-filled dry 
box, an oven-dried (135 °C) 20-ml vial equipped with a magnetic stir bar was 
charged with vinyl ether 11 (1.00 g, 4.19 mmol) and 1.0 mol% of in situ-generated 
complex 1a (419 µl, 0.100 M, 41.9 µmol; final substrate concentration = 1.70 M). 
A separate 2.0-ml vial was charged with 1-octadecene (12, 2.12 g, 8.39 mmol) and 
decalin (2.10 ml). The resulting solution was transferred to the mixture of 11 and 
1a by syringe; a septum, fitted with an outlet needle, was attached to the vial and 
an adapter was attached to the top of the septum and vacuum (~1.0 torr) applied. 
The resulting solution was stirred for 3 h. The vessel was removed from the dry 
box and the reaction was quenched by the addition of wet Et2O (~1.0 ml). The 
unpurified product is >98% Z (as determined by 400 MHz 1H NMR analysis). 
The residue was dissolved in Et2O and passed through a 2.5-cm plug of neutral 
alumina to remove inorganic salts, and the solution was concentrated. In a 25-ml 
round-bottom flask equipped with a stir bar, the resulting residue was treated 


24 MARCH 2011 | VOL 471 | NATURE | 465 


©2011 Macmillan Publishers Limited. All rights reserved 



with (n-Bu)4NF (1.0 M in THF, 21.0 ml, 21.0 mmol), and stirred for 2 h. The 
mixture was diluted with Et2O (200 ml), passed through a 5-cm plug of neutral 
alumina, and concentrated. The resulting white solid was purified by 
chromatography on neutral alumina (100% hexanes) to afford 15 as a white solid 
(m.p. 30-31 °C, 0.914 g, 2.98 mmol, 71.0% yield; >98% Z isomer). 
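
As a sanity check, the quantities stated in the general procedure can be cross-checked with simple arithmetic. The sketch below uses only the amounts given in the procedure; the variable names are illustrative, and the yield is computed against vinyl ether 11 on the assumption that it is the limiting reagent.

```python
# Amounts as stated in the general procedure (variable names are illustrative).
vinyl_ether_mmol = 4.19      # vinyl ether 11, assumed to be the limiting reagent
catalyst_volume_ul = 419.0   # volume of catalyst solution
catalyst_conc_molar = 0.100  # mol/l
octadecene_mmol = 8.39       # 1-octadecene (12)
fluoride_mmol = 21.0         # (n-Bu)4NF used in the desilylation step
product_mmol = 2.98          # isolated product

# Microlitres multiplied by mol/l gives micromoles directly.
catalyst_umol = catalyst_volume_ul * catalyst_conc_molar
catalyst_mol_percent = catalyst_umol / (vinyl_ether_mmol * 1000.0) * 100.0

equiv_octadecene = octadecene_mmol / vinyl_ether_mmol
equiv_fluoride = fluoride_mmol / vinyl_ether_mmol
yield_percent = product_mmol / vinyl_ether_mmol * 100.0

print(f"catalyst: {catalyst_umol:.1f} umol ({catalyst_mol_percent:.2f} mol%)")
print(f"12: {equiv_octadecene:.2f} equiv.; (n-Bu)4NF: {equiv_fluoride:.2f} equiv.")
print(f"isolated yield: {yield_percent:.1f}%")
```

Under these assumptions the catalyst charge corresponds to 1.0 mol%, the alkene to about 2 equiv. and the fluoride source to about 5 equiv., in line with the loadings quoted elsewhere in the text, and the isolated yield comes out at roughly 71%.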


Received 25 January; accepted 22 February 2011. 
Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report 
the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. Several new 
and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the data set. These 
include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in 
histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB 
signalling was indicated by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, 
activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in 
multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will 
yield new insights into cancer not anticipated by existing knowledge. 


Multiple myeloma is an incurable malignancy of mature B-lymphoid 
cells, and its pathogenesis is only partially understood. About 40% of 
cases harbour chromosome translocations resulting in overexpression 
of genes (including CCND1, CCND3, MAF, MAFB, WHSC1 (also 
called MMSET) and FGFR3) via their juxtaposition to the immuno- 
globulin heavy chain (IgH) locus’. Other cases exhibit hyperdiploidy. 
However, these abnormalities are probably insufficient for malignant 
transformation because they are also observed in the pre-malignant 
syndrome known as monoclonal gammopathy of uncertain signifi- 
cance. Malignant progression events include activation of MYC, 
FGFR3, KRAS and NRAS and activation of the NF-κB pathway’®. 
More recently, loss-of-function mutations in the histone demethylase 
UTX (also called KDM6A) have also been reported’. 

A powerful way to understand the molecular basis of cancer is to 
sequence either the entire genome or the protein-coding exome, com- 
paring tumour to normal from the same patient to identify the acquired 
somatic mutations. Recent reports have described the sequencing of 
whole genomes from a single patient’°. Although informative, we 
hypothesized that a larger number of cases would permit the identifica- 
tion of biologically relevant patterns that would not otherwise be evident. 


Landscape of multiple myeloma mutations 


We studied 38 multiple myeloma patients (Supplementary Table 1), 
performing whole-genome sequencing (WGS) for 23 patients and 


whole-exome sequencing (WES; assessing 164,687 exons) for 16 
patients, with one patient analysed by both approaches (Supplemen- 
tary Information). WES is a cost-effective strategy to identify protein- 
coding mutations, but cannot detect non-coding mutations and 
rearrangements. We identified tumour-specific mutations by com- 
paring each tumour to its corresponding normal, using a series of 
algorithms designed to detect point mutations, small insertions/dele- 
tions (indels) and other rearrangements (Supplementary Fig. 1). On 
the basis of WGS, the frequency of tumour-specific point mutations 
was 2.9 per million bases, corresponding to approximately 7,450 point 
mutations per sample across the genome, including an average of 35 
amino-acid-changing point mutations plus 21 chromosomal rearrange- 
ments disrupting protein-coding regions (Supplementary Tables 2 and 
3). The mutation-calling algorithm was found to be highly accurate, 
with a true positive rate of 95% for point mutations (Supplementary 
Text, Supplementary Tables 4 and 5, and Supplementary Fig. 2). 
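
The cohort size and mutation-rate figures quoted above can be tied together with a short calculation. This sketch takes only the numbers in the text; the "implied covered territory" is an inference obtained by dividing the per-sample count by the per-megabase rate, not a value reported by the authors, and the variable names are illustrative.

```python
# Numbers quoted in the text (variable names are illustrative).
wgs_patients = 23
wes_patients = 16
both_methods = 1             # one patient was analysed by both WGS and WES
rate_per_megabase = 2.9      # tumour-specific point mutations per million bases
mutations_per_sample = 7450  # approximate genome-wide count per sample

# Inclusion-exclusion: the patient sequenced by both methods is counted once.
total_patients = wgs_patients + wes_patients - both_methods

# Assumption: count = rate x covered territory, so territory = count / rate.
implied_territory_mb = mutations_per_sample / rate_per_megabase

print(total_patients)
print(f"{implied_territory_mb:.0f} Mb")
```

The patient count reproduces the cohort of 38, and the implied territory is roughly 2.6 billion bases, which would be consistent with counting only adequately covered positions, although the text does not state this.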

The mutation rate across the genome varied greatly depending on 
base composition, with mutations at CpG dinucleotides occurring 
fourfold more commonly than mutations at A or T bases (Sup- 
plementary Fig. 3a). In addition, even after correction for base com- 
position, the mutation frequency in coding regions was lower than 
that observed in intronic and intergenic regions (P< 1X 10 "*; 
Supplementary Fig. 3b), potentially owing to negative selective pres- 
sure against mutations disrupting coding sequences. There is also a 
lower mutation rate in intronic regions compared to intergenic 
regions (P<1X 10 '°), which may reflect transcription-coupled 
repair, as previously suggested'’®''. Consistent with this explanation, 
we observed a lower mutation rate in introns of genes expressed in 
multiple myeloma compared to those not expressed (Fig. 1a). 


Frequently mutated genes 

We next focused on the distribution of somatic, non-silent protein- 
coding mutations. We estimated statistical significance by com- 
parison to the background distribution of mutations (Supplemen- 
tary Information). Ten genes showed statistically significant rates of 
protein-altering mutations (‘significantly mutated genes’) at a false 
discovery rate (FDR) of ≤0.10 (Table 1). To investigate their func- 
tional importance, we compared their predicted consequence (on the 
basis of evolutionary conservation and nature of the amino acid 
Figure 1 | Evidence for transcription-coupled repair and functional 
importance of statistically significant mutations. a, Intronic mutation rates 
subdivided by gene expression rates in multiple myeloma. Rates of gene 
expression were estimated by proportion of Affymetrix Present (P) calls in 304 
primary multiple myeloma samples. Error bars indicate ±1 standard deviation. 
NS, not significant. b, Functional importance (FI) scores were generated for all 
point mutations and divided into distributions for nonsignificant mutations 
(top histogram; n = 1,019) and significant mutations (bottom; n = 36). 
Comparison of distributions is via the Kolmogorov-Smirnov statistic. 
change) to the distribution of all coding mutations. This analysis 
showed a dramatic skewing of functional importance (FI) scores’? 
for the ten significantly mutated genes (P= 7.6 X 10° '*; Fig. 1b), 
supporting their biological relevance. Even after RAS and p53 muta- 
tions were excluded from the analysis, the skewing remained significant 
(P < 0.01). 
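
The q-values reported for the significantly mutated genes are P-values adjusted to control the FDR. The adjustment actually used is described in the Supplementary Information and is not reproduced here; purely as an illustration, the sketch below implements the widely used Benjamini-Hochberg procedure. The function name and the toy P-values are assumptions made for this example and are not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that each q-value is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy P-values (hypothetical), one per tested gene.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_q = benjamini_hochberg(toy_p)

# Genes passing an FDR threshold of 0.10, as in the cut-off used in the text.
significant = [q <= 0.10 for q in toy_q]
print(toy_q, significant)
```

Because each q-value depends on the rank of its P-value among all genes tested, the same P-value can map to different q-values in studies that test different numbers of genes.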

We also examined the non-synonymous/synonymous (NS/S) muta- 
tion rate for the significantly mutated genes. The expected NS/S ratio 
was 2.82 ± 0.15, whereas the observed ratio was 39:0 for the significant 
genes (P < 0.0001), further strengthening the case that these genes are 
probably drivers of the pathogenesis of multiple myeloma and that their 
mutations are unlikely to be mere passengers. 
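
To see why an observed NS/S ratio of 39:0 is so unlikely under the expected ratio, one can treat each mutation as an independent draw that is non-synonymous with probability r/(1 + r), where r is the expected NS/S ratio. This is a simplified binomial sketch and not necessarily the test the authors used; the function name is illustrative.

```python
from math import comb


def upper_tail_binomial(successes, trials, p_success):
    """P(X >= successes) for X ~ Binomial(trials, p_success)."""
    return sum(
        comb(trials, k) * p_success**k * (1.0 - p_success) ** (trials - k)
        for k in range(successes, trials + 1)
    )


expected_ratio = 2.82  # expected NS/S ratio from the text
nonsyn, syn = 39, 0    # observed counts in the significantly mutated genes

# Probability that a single mutation is non-synonymous under the null model.
p_nonsyn = expected_ratio / (1.0 + expected_ratio)

p_value = upper_tail_binomial(nonsyn, nonsyn + syn, p_nonsyn)
print(f"P(at least {nonsyn} of {nonsyn + syn} non-synonymous) = {p_value:.1e}")
```

Under these assumptions the probability of seeing no synonymous mutations among 39 falls well below 0.0001, in keeping with the significance level quoted in the text.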

The significantly mutated genes include three previously reported 
to have point mutations in multiple myeloma: KRAS and NRAS (10 
and 9 cases, respectively (50%), P < 1 × 10^-11, q < 1 × 10^-6), and 
TP53 (3 cases (8%), P = 5.1 × 10^-6, q = 0.019). Interestingly, we 
identified two point mutations (5%, P= 0.000027, q = 0.086) in 
CCND1 (cyclin D1), which has long been recognized as a target of 
chromosomal translocation in multiple myeloma, but for which point 
mutations have not been observed previously in cancer. 

The remaining six genes have not previously been known to be 
involved in cancer, and indicate new aspects of the pathogenesis of 
multiple myeloma. 


RNA processing and protein homeostasis mutations 

A striking finding of this study was the discovery of frequent muta- 
tions in genes involved in RNA processing, protein translation and the 
unfolded protein response. Such mutations were observed in nearly 
half of the patients. 

The DIS3 (also called RRP44) gene harboured mutations in 4 out of 
38 patients (11%, P = 2.4 × 10^-6, q = 0.011). DIS3 encodes a highly 
conserved RNA exonuclease which serves as the catalytic component 
of the exosome complex involved in regulating the processing and 
abundance of all RNA species'*"*. The four observed mutations occur 
at highly conserved regions (Fig. 2a) and cluster within the RNB 
domain facing the enzyme’s catalytic pocket (Fig. 2b). Two lines of 
evidence indicate that the DIS3 mutations result in loss of function. 
First, three of the four tumours with mutations exhibited loss of 
heterozygosity via deletion of the remaining DIS3 allele. Second, 
two of the mutations have been functionally characterized in yeast 
and bacteria, where they result in loss of enzymatic activity leading to 
the accumulation of their RNA targets'*’®. Given that a key role of the 
exosome is the regulation of the pool of mRNAs available for 
translation”, these results indicate that DIS3 mutations may dysregulate 
protein translation as an oncogenic mechanism in multiple 
myeloma. 

Further support for a role of translational control in the pathogenesis 
of multiple myeloma comes from the observation of mutations in the 
FAM46C gene in 5 out of 38 (13%) patients (P = 1.8 × 10^-10, 
q = 1 × 10^-6). There is no published functional annotation of 
FAM46C, and its sequence lacks obvious homology to known proteins. 
To gain insight into its cellular role, we examined its pattern of gene 
expression across 414 multiple myeloma samples and compared it to 
the expression of 395 gene sets curated in the Molecular Signatures 
Database (MSigDB), using the GSEA algorithm’*”°. The expression 
of FAM46C was highly correlated (q = 0.034 after multiple hypothesis 
correction; Fig. 2c) to the expression of the set of ribosomal proteins that 
are known to be tightly co-regulated*’. Strong correlation with eukar- 
yotic initiation and elongation factors involved in protein translation 
was similarly observed. Although the precise function of FAM46C 
remains unknown, this striking correlation provides strong evidence 
that FAM46C is functionally related in some way to the regulation of 
translation. Consistent with this observation, FAM46C was recently 
shown to function as an mRNA stability factor (M. Fleming, manu- 
script submitted). 
Table 1 | Statistically significant protein-coding mutations in multiple myeloma 


Gene     N       n   Untreated n  CpG transition  Other C:G transition  C:G transversion  A:T mutation  Indel/null  P-value          q-value 
NRAS     20,711  9   3            0               0                     3                 6             0           <1.0 × 10^-11    <1.0 × 10^-6 
KRAS     25,728  10  6            0               5                     1                 4             0           <1.0 × 10^-11    <1.0 × 10^-6 
FAM46C   39,661  5   3            0               0                     2                 1             2           1.8 × 10^-10     1.0 × 10^-6 
DIS3     89,758  4   1            0               1                     1                 2             0           2.4 × 10^-6      0.011 
TP53     32,585  3   1            0               0                     1                 1             1           5.1 × 10^-6      0.019 
CCND1    12,899  2   1            0               0                     0                 2             0           0.000027         0.086 
PNRC1    19,621  2   2            0               1                     0                 0             1           0.000039         0.094 
ALOX12B  40,369  3   0            1               0                     1                 1             0           0.000042         0.094 
HLA-A    18,635  2   0            0               0                     0                 2             0           0.000045         0.094 
MAGED1   53,950  2   1            0               0                     0                 0             2           0.000053         0.10 


Territory (N) refers to total covered territory in base pairs across 38 sequenced samples. Total numbers of mutations (n) and numbers of mutations occurring in therapy-naive disease (Untreated n) are shown for each gene. 


Notably, although not statistically significant on their own, we found 
mutations in five other genes related to protein translation, stability 
and the unfolded protein response (Supplementary Table 6), further 
supporting a role of translational control in multiple myeloma. Of 
particular interest, two patients had mutations in the unfolded protein 
response gene XBP1. Overexpression of a particular splice form of 
XBP1 has been shown to cause a multiple-myeloma-like syndrome 
in mice, although no role of XBP1 in the pathogenesis of human 
multiple myeloma has been described”. 

Of related interest, mutations of the LRRK2 gene were observed in 3
out of 38 patients (8%; Supplementary Table 6). LRRK2 encodes a
serine-threonine kinase that phosphorylates translation initiation
factor 4E-binding protein (4EBP). LRRK2 is best known for its role
in the predisposition to Parkinson’s disease. Parkinson’s disease
and other neurodegenerative diseases such as Huntington’s disease
are characterized in part by aberrant unfolded protein responses.
Protein homeostasis may be particularly important in multiple myeloma
because of the enormous rate of production of immunoglobulins by
multiple myeloma cells. The finding is also of clinical significance
because of the success of the drug bortezomib (Velcade), which
inhibits the proteasome and which shows remarkable activity in
multiple myeloma compared to other tumour types.

Together, these results indicate that mutations affecting protein
translation and homeostasis are extremely common in multiple myeloma
(at least 16 out of 38 patients; 42%), thereby indicating that
additional therapeutic approaches that target these mechanisms may be
worth exploring.


Identical mutations suggest gain-of-function oncogenes 
Another way to recognize biologically significant mutations is to
search for recurrence of identical mutations indicative of
gain-of-function alterations in oncogenes. Two patients had an
identical mutation (K123R) in the DNA-binding domain of the interferon
regulatory factor IRF4. Interestingly, a recent RNA interference
screen in multiple myeloma showed that IRF4 was required for multiple
myeloma survival, consistent with its role as a putative oncogene.
Genotyping for this mutation in 161 additional multiple myeloma
samples identified two more patients with this mutation. IRF4 is a
transcriptional regulator of PRDM1 (also called BLIMP1), and two of 38
sequenced patients also exhibited PRDM1 mutations. PRDM1 is a
transcription factor involved in plasma cell differentiation,
loss-of-function mutations of which occur in diffuse large B-cell
lymphoma.

Figure 2 | Mutations likely to affect protein translation and/or homeostasis
in multiple myeloma. a, Alignment of human, yeast and bacterial RNB
domain of DIS3. Positions of observed mutations are indicated with respect to
the human sequence. Yeast equivalents are, respectively, S541, V568, G833 and
R847. b, One-dimensional and three-dimensional structures of yeast DIS3,
with the RNB domain coloured in blue and mutations coloured in red. c, GSEA
plot showing enrichment of ribosomal protein gene set among genes correlated
with FAM46C expression in 414 multiple myeloma samples.


Clinically actionable mutations in BRAF 


Some mutations deserve attention because of their clinical relevance.
One of the thirty-eight patients harboured a BRAF kinase mutation
(G469A). Although BRAF G469A has not previously been observed in
multiple myeloma, this precise mutation is known to be activating and
oncogenic. We genotyped an additional 161 multiple myeloma
patients for the 12 most common BRAF mutations and found mutations
in 7 patients (4%). Three of these were K601N and four were
V600E (the most common BRAF mutation in melanoma). Our
finding of common BRAF mutations in multiple myeloma has
important clinical implications because such patients may benefit
from treatment with BRAF inhibitors, some of which show marked
clinical activity. Our results also support the observation that
inhibitors acting downstream of BRAF (for example, on MEK) may have
activity in multiple myeloma.


Gene set mutations: NF-κB pathway


Another approach to identify biologically relevant mutations in mul- 
tiple myeloma is to look not at the frequency of mutation of individual 
genes, but rather of sets of genes. 

We first considered gene sets based on existing insights into the
biology of multiple myeloma. For example, activation of the NF-κB
pathway is known in multiple myeloma, but the basis of such
activation is only partially understood. We observed 10 point mutations
(P = 0.016) and 4 structural rearrangements, affecting 11 NF-κB
pathway genes (Supplementary Table 7): BTRC, CARD11, CYLD,
IKBIP, IKBKB, MAP3K1, MAP3K14, RIPK4, TLR4, TNFRSF1A and
TRAF3. Taken together, our findings greatly expand the mechanisms
by which NF-κB may be activated in multiple myeloma.


Gene set mutations: histone modifying enzymes 

We next looked for enrichment in mutations in histone-modifying 
enzymes. This hypothesis arose because of our observation that the 
homeotic transcription factor HOXA9 was highly expressed in a sub- 
set of multiple myeloma patients, particularly those lacking known 
IgH translocations (Supplementary Fig. 4a). HOXA9 expression is 
regulated primarily by histone methyltransferases (HMT) including 
members of the MLL family. Sensitive polymerase chain reaction with 
reverse transcription (RT-PCR) analysis showed that HOXA9 was in 
fact ubiquitously expressed in multiple myeloma, with most cases 
exhibiting biallelic expression consistent with dysregulation via an 
upstream HMT event (Supplementary Fig. 4b, c). Accordingly, we 
looked for mutations in genes known to regulate HOXA9 directly. 
We found significant enrichment (P = 0.0024), with mutations in
MLL, MLL2, MLL3, UTX, WHSC1 and WHSC1L1.

HOXA9 is normally silenced by histone 3 lysine 27 trimethylation
(H3K27me3) chromatin marks when cells differentiate beyond the
haematopoietic stem-cell stage. This repressive mark was weak or
absent at the HOXA9 locus in most multiple myeloma cell lines (Fig. 3a).
Moreover, there was inverse correlation between H3K27me3 levels and
HOXA9 expression (Fig. 3b), consistent with HMT dysfunction
contributing to aberrant HOXA9 expression.

To establish the functional significance of HOXA9 expression in 
multiple myeloma cells, we knocked down its expression with seven 
shRNAs (Supplementary Fig. 5). In 11 out of 12 multiple myeloma cell 
lines, HOXA9-depleted cells exhibited a competitive disadvantage 
(Fig. 3c and Supplementary Fig. 6). 

These experiments indicate that aberrant HOXA9 expression,
caused at least in part by HMT-related genomic events, has a role
in multiple myeloma and may represent a new therapeutic target.
Further supporting a role of HOXA9 as a multiple myeloma oncogene,
array-based comparative genomic hybridization identified focal
amplifications of the HOXA locus in 5% of patients (Supplementary
Fig. 7).

Figure 3 | HOXA9 is a candidate oncogene in multiple myeloma.
a, H3K27me3 enrichment at the HOXA9 promoter in CD34 cells, CD19 cells
and multiple myeloma cell lines relative to H3K27me3 methylation at the BC
site, known to be hypomethylated in all cells. b, Relative HOXA9 expression
against H3K27me3 enrichment at the HOXA9 locus. c, GFP competition assay
in multiple myeloma cell lines. After lentiviral infection with seven HOXA9
shRNAs or a control shRNA targeting luciferase, GFP-positive cells were
monitored by flow cytometry and compared to the proportion of GFP-positive
cells present in the population 3 days after infection (designated day 0). Error
bars indicate standard error of the mean and represent a minimum of three
independent experiments.


Discovering new gene set mutations 

We next asked whether it would be possible to discover pathways
enriched for mutations in the absence of previous knowledge.
Accordingly, we examined 616 gene sets in the MSigDB Canonical
Pathways database. One top-ranking gene set was of particular
interest because it did not relate to genes known to be important in
multiple myeloma. This gene set encodes proteins involved in the
formation of the fibrin clot in the blood coagulation cascade. There
were 6 mutations, in 5 of 38 patients (16%, q = 0.0054), encoding 5
proteins (Supplementary Table 8). RT-PCR analysis confirmed
expression of 4 of the 5 coagulation factors in multiple myeloma cell
lines (Supplementary Fig. 8). The coagulation cascade involves a
number of extracellular proteases and their substrates and regulators,
but their role in multiple myeloma has not been suspected. However,
thrombin and fibrin have been shown to serve as mitogens in other
cell types, and have been implicated in metastasis. These
observations suggest that coagulation factor mutations should be
explored more fully in human cancers.

Chr  Start      End        Length (nt)  Mut.  Samples  P-value       q-value  Separation (nt)         Gene                      Coding events
1    554350     555310     960          3     3        3.86 × 10⁻⁶   0.11     494, 44                 AK125248 (intron)         -
1    82793220   82793300   80           2     2        8.39 × 10⁻⁶   0.19     8                       TTLL7/LPHN2 (IGR)         -
1    47333070   47335140   2,070        4     3        2.47 × 10⁻⁶   0.09     350, 1,85               NBPFA (intron)            -
2    40865560   40865630   70           2     2        4.99 × 10⁻⁶   0.14     2                       SLC8A1/PKDCC (IGR)        -
3    49273920   49274010   90           2     2        4.80 × 10⁻⁶   0.14     78                      ZIC4/AGTR1 (IGR)          -
3    89142550   89143600   1,050        8     5        555x104       39x10    298,8,17,26, 26, 80, 1  BCL6/LPP (IGR)            -
3    89440810   89441310   500          3     3        2.64 × 10⁻⁶   0.09     ,291                    LPP (intron)              -
4    7819430    7819530    100          2     2        8.01 × 10⁻⁶   0.18     26                      AFAP1 (intron)            Missense mutation
4    39875900   39876610   710          3     2        5.88 × 10⁻⁶   0.16     09, 412                 RHOH (intron)             -
4    62180540   62181370   830          3     3        1.05 × 10⁻⁵   0.22     211, 432                LPHN3 (intron)            -
4    57902080   57904460   2,380        4     4        6.95 × 10⁻⁶   0.17     996, 423, 443           PDGFC (3′ UTR/intron)     -
7    92754250   92754270   20           2     2        2.03 × 10⁻⁷   0.02                             CCDC132 (intron)          -
9    6564360    6565100    740          3     2        8.65 × 10⁻⁶   0.19     250, 76                 BNC2 (intron)             -
12   120943010  120943460  450          3     3        6.99 × 10⁻⁷   0.04     7, 9                    BCL7A (promoter)          -
12   120943580  120946950  3,370        4     3        1.47 × 10⁻⁸   0.0017   2055, 657, 295          BCL7A (promoter/intron)   -
14   68327320   68333190   5,870        4     4        7.05 × 10⁻⁶   0.17     397, 156, 35            ZFP36L1 (intron)          Indel
17   8106910    8111850    4,940        4     2        4.85 × 10⁻⁶   0.14     483, 389, 83            PFAS (intron)             Complex rearrangement
20   60328960   60329510   550          2     2        1.42 × 10⁻⁶   0.06     20                      LAMA5 (intron)            Missense mutation

Regions of predicted regulatory potential showing mutation frequency beyond that expected by chance are shown (q < 0.25). Mut., mutations. ‘Start’ and ‘End’ columns indicate the first and last nucleotide of regions of regulatory potential according to hg18/NCBI36. ‘Separation’ column indicates the number of nucleotides within the regulatory region separating the observed mutations.
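
One simple way to ask whether a gene set carries more mutations than expected is a one-sided binomial test on the share of sequenced territory that the set occupies. The sketch below is only an illustration under the assumption that mutations fall uniformly across covered territory; it is not the analysis used in the study, and the function names and the numbers in the usage line are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def gene_set_pvalue(set_mutations, total_mutations, set_territory, total_territory):
    """One-sided P-value for seeing at least `set_mutations` in the gene set.

    Assumes each mutation lands in the set with probability equal to the
    set's share of the covered territory (a simplifying assumption).
    """
    share = set_territory / total_territory
    return binomial_tail(set_mutations, total_mutations, share)


# Hypothetical numbers: 6 of 1,000 mutations in a set covering 0.1% of territory.
example_p = gene_set_pvalue(6, 1000, 30_000, 30_000_000)
```

A uniform background is a strong simplification, since mutation rates vary with sequence context and coverage.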


Mutations in non-coding regions 


Analyses of non-coding portions of the genome have not previously 
been reported in cancer. We focused on non-coding regions with 
highest regulatory potential. We defined 2.4 X 10° regulatory poten- 
tial regions (Supplementary Fig. 9), averaging 280 base pairs (bp). We 
then treated these regions as if they were protein-coding genes, sub- 
jecting them to the same permutation analysis used for exonic regions. 
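
The idea of testing a region for excess mutations can be sketched with a small Monte Carlo null model. This is a hedged illustration, not the permutation analysis of the study (which is described in its Supplementary Information): it assumes that mutations are scattered uniformly over the total analysed territory, and the function name, parameters and example numbers are hypothetical.

```python
import random


def permutation_pvalue(observed, region_len, total_len, total_mutations,
                       n_perm=2000, seed=0):
    """Empirical P-value for `observed` or more mutations in one region.

    Each permutation re-scatters all mutations uniformly over the territory
    and counts how many fall inside the region of interest.
    """
    rng = random.Random(seed)
    frac = region_len / total_len
    at_least = 0
    for _ in range(n_perm):
        hits = sum(rng.random() < frac for _ in range(total_mutations))
        if hits >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least + 1) / (n_perm + 1)


# Hypothetical example: 3 mutations in a 280-bp region of a 1-Mb territory.
p_region = permutation_pvalue(3, 280, 1_000_000, 50, n_perm=200)
```

With the add-one correction the smallest reportable P-value is 1/(n_perm + 1), so very small P-values need many permutations or an analytical tail.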

We identified multiple non-coding regions with high frequencies of 
mutation which fell into two classes (Table 2 and Supplementary 
Table 9). The first corresponds to regions of known somatic hyper- 
mutation. These have a 1,000-fold higher than expected mutation 
frequency, as expected for post-germinal centre B cells (Supplemen- 
tary Table 9). These regions comprise immunoglobulin-coding genes 
and the 5′ UTR of the lymphoid oncogene, BCL6, as reported.
Interestingly, we also found previously unrecognized mutations in
the intergenic region flanking BCL6 in five patients, indicating that
somatic hypermutation probably occurs in regions beyond the
5′ UTR and first intron of BCL6 (Table 2). Whether such non-coding
BCL6 mutations contribute to multiple myeloma pathogenesis 
remains to be established. 

The second class consisted of 18 non-coding regions with mutation 
frequencies beyond that expected by chance (q < 0.25) (Table 2 and 
Supplementary Table 10). Four of the 18 regions flanked genes that 
also harboured coding mutations. Interestingly, we observed 7
mutations in 5 of 23 patients (22%) within non-coding regions of BCL7A, a
putative tumour suppressor gene discovered in the B-cell malignancy
Burkitt lymphoma, and which is also deleted or hypermethylated in
cutaneous T-cell lymphomas. The function of BCL7A is unknown,
and the effect of its non-coding mutations in multiple myeloma
remains to be established.


Our preliminary analysis of non-coding mutations indicates that 
non-exonic portions of the genome may represent a previously 
untapped source of insight into the pathogenesis of cancer. 


Discussion 


The analysis of multiple myeloma genomes reveals that mechanisms 
previously suspected to have a role in the biology of multiple myeloma 
(for example, NF-κB activation and HMT dysfunction) may have
broad roles by virtue of mutations in multiple members of these path- 
ways. In addition, potentially new mechanisms of transformation are 
suggested, including mutations in the RNA exonuclease DIS3 and 
other genes involved in protein translation and homeostasis. 
Whether these mutations are unique to multiple myeloma or are 
common to other cancers remains to be determined. Furthermore, 
frequent mutations in the oncogenic kinase BRAF were observed—a 
finding that has immediate clinical translational implications. 

Importantly, most of these discoveries could not have been made by 
sequencing only a single multiple myeloma genome—the complex 
patterns of pathway dysregulation required the analysis of multiple 
genomes. Whole-exome sequencing revealed the substantial majority 
of the significantly mutated genes. However, we note that half of total 
protein-coding mutations occurred via chromosomal aberrations 
such as translocations, most of which would not have been discovered 
by sequencing of the exome alone. Similarly, the recurrent point 
mutations in non-coding regions would have been missed with 
sequencing directed only at coding exons. 

The analysis described here is preliminary. Additional multiple 
myeloma genomes will be required to establish the definitive genomic 
landscape of the disease and determine accurate estimates of mutation 
frequency in the disease. The sequence data described here will be
available from the dbGaP repository (http://www.ncbi.nlm.nih.gov/gap)
and we have created a multiple myeloma Genomics Portal
(http://www.broadinstitute.org/mmgp) to support data analysis and
visualization.


METHODS SUMMARY 


Informed consent from multiple myeloma patients was obtained in line with the
Declaration of Helsinki. DNA was extracted from bone marrow aspirate (tumour)
and blood (normal). WGS libraries (370-410-bp inserts) and WES libraries
(200-350-bp inserts) were constructed and sequenced on an Illumina GA-II
sequencer using 101- and 76-bp paired-end reads, respectively. Sequencing reads
were processed with the Firehose pipeline, identifying somatic point mutations,
indels and other structural chromosomal rearrangements. Structural
rearrangements affecting protein-coding regions were then subjected to manual
review to exclude alignment artefacts. True positive mutation rates were
estimated by Sequenom mass spectrometry genotyping of randomly selected
mutations. HOXA9 short hairpin (sh)RNAs were introduced into multiple myeloma
cell lines by lentiviral infection using standard methods.

A complete description of the materials and methods is provided in the
Supplementary Information.

Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains
unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are
pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development
of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome
in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and
non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded
discovery using established experimental, prediction and conservation-based approaches. These data substantially
expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of
transcriptome dynamics throughout development.


Drosophila melanogaster is an important non-mammalian model
system that has had a critical role in basic biological discoveries, such as
identifying chromosomes as the carriers of genetic information and
uncovering the role of genes in development. Because it shares a
substantial genic content with humans, Drosophila is increasingly
used as a translational model for human development, homeostasis
and disease.

High-quality maps are needed for all functional genomic elements.
Previous studies demonstrated that a rich collection of genes is
deployed during the life cycle of the fly. Although expression
profiling using microarrays has revealed the expression of ~13,000
annotated genes, it is difficult to map splice junctions and individual base
modifications generated by RNA editing using such approaches.
Single-base resolution is essential to define precisely the elements that
comprise the Drosophila transcriptome.

Estimates of the number of transcript isoforms are less accurate than
estimates of the number of genes. Whereas ~20% of Drosophila genes
are annotated as encoding alternatively spliced pre-mRNAs,
splice-junction microarray experiments indicate that this number is at least
40% (ref. 7). Determining the diversity of mRNAs generated by
alternative promoters, alternative splicing and RNA editing will
substantially increase the inferred protein repertoire. Non-coding RNA
genes (ncRNAs) including short interfering RNAs (siRNAs) and
microRNAs (miRNAs) (reviewed in ref. 10), and longer ncRNAs
such as bxd (ref. 11) and rox (ref. 12), have important roles in gene
regulation, whereas others such as small nucleolar RNAs (snoRNAs)
and small nuclear RNAs (snRNAs) are important components of
macromolecular machines such as the ribosome and spliceosome.
The transcription and processing of these ncRNAs must also be fully
documented and mapped.

As part of the modENCODE project to annotate the functional
elements of the D. melanogaster and Caenorhabditis elegans genomes,
we used RNA-Seq and tiling microarrays to sample the Drosophila
transcriptome at unprecedented depth throughout development from
early embryo to ageing male and female adults. We report on a
high-resolution view of the discovery, structure and dynamic expression of
the D. melanogaster transcriptome.


Strategy for characterization of the transcriptome 


To discover new transcribed features (Supplementary Table 1) and
comprehensively characterize their expression dynamics throughout
development, we conducted complementary tiling microarray and
RNA-Seq experiments using RNA isolated from 30 whole-animal
samples representing 27 distinct stages of development
(Supplementary Table 2). These included 12 embryonic samples collected
at 2-h intervals for 24 h, six larval, six pupal and three sexed adult
stages at 1, 5 and 30 days after eclosion. We used 38-base-pair (bp)
resolution genome tiling microarrays to analyse total RNA from all 30
biological samples and poly(A)+ mRNA from the 12 embryonic samples
(Supplementary Fig. 1). To attain single-nucleotide resolution and to
facilitate the analysis of alternative splicing and RNA editing, we
performed non-strand-specific poly(A)+ RNA-Seq from all 30 samples,
generating a combination of single and paired-end ~75-bp reads on the
Illumina Genome Analyser IIx platform (short poly(A)+ RNA-Seq)
(Supplementary Table 3 and Supplementary Fig. 2). To identify
primary transcripts and non-coding RNAs, the 12 embryonic time
points were also interrogated with strand-specific 50-bp sequence reads
from partially rRNA-depleted total RNA on the Applied Biosystems
SOLiD platform (Supplementary Table 4 and Supplementary Fig. 3). To
improve connectivity, mixed-stage embryos, adult males and adult
females were used to generate ~250-bp reads on the Roche 454
platform (non-strand-specific long poly(A)+ RNA-Seq) (Supplementary
Table 5). In total, we generated 176,962,906,041 bp of mapped sequence
representing 1,266-fold coverage of the genome and 5,902-fold coverage
of the annotated D. melanogaster transcriptome.

Figure 1 | Discovery of new RNAs in the Bithorax complex. Genomic
organization and experimental evidence for new transcripts located between
the HOX genes, abd-A and Abd-B, based on short poly(A)+ RNA and total
RNA-Seq expression profiles. The numbers to the left of each track indicate the
maximal number of reads for that sample. Three manually curated
junction-based transcript models are shown; the green transcript model was fully
validated by a cDNA, MIP06894.


Discovery of new transcribed regions


We identified 1,938 new transcribed regions (NTRs) not linked to any
annotated gene models. Herein, ‘transcripts’ refer to RNA molecules
synthesized from a genomic locus whereas ‘genes’ refer to one or more
transcripts that share exons in their mature spliced form. modENCODE
cDNAs fully support 13% of the NTRs (Supplementary Fig. 4) and
partially support 23%. Most NTRs (84%) are detected by poly(A)+
RNA-Seq, 44% by total RNA-Seq and 42% by tiling array.
Approximately half of the NTRs are conserved in the distantly related
Drosophila pseudoobscura and Drosophila mojavensis (Supplementary
Fig. 4b) and 30% of these are detected by poly(A)+ RNA-Seq data from
D. pseudoobscura or D. mojavensis adult heads (Supplementary Fig. 4c,
d, Supplementary Table 6 and Supplementary Methods). The NTRs
probably eluded previous detection because they are expressed at low
levels, in temporally restricted patterns, and are enriched for single-exon
genes. The new multi-exon gene models (48%) have fewer, shorter and
less conserved exons than annotated genes.

Nearly one-third of the NTRs have a predicted open reading frame
(ORF) greater than 100 amino acids. The remaining NTRs could
encode small peptides but many are likely to be non-coding RNAs.
A small fraction (9%) of NTRs are heterochromatic; most of these
(232) have sequence similarity (greater than 100-nucleotide match
and greater than 60% identity) to transposable elements (TEs) and
represent transcribed TEs or TE fragments. It remains to be
determined whether these regions have any function, although recent studies
describe TE-associated regions that have acquired functions.

Even in the well-studied Bithorax complex we found an NTR.
Known genetic breakpoints in the infra-abdominal regions iab-3 to
iab-8, which lie between the homeotic genes abdominal A (abd-A) and
Abdominal B (Abd-B), disrupt normal male development and affect
fertility. Within this region are regulatory elements and evidence
for long non-coding RNAs that have eluded detection for over
20 years. We used the RNA-Seq data to infer the structures of at
least three overlapping transcripts and validated one form (Fig. 1).
The RNAs are expressed in embryos and adult males but not females. 
On the basis of the presumed role of this new gene and spatial expres- 
sion in the embryonic gonad (data not shown), we have named it male 
specific abdominal (msa). The cDNA contains short ORFs that are 
conserved in the melanogaster subgroup and could encode male- 
specific peptides. Whether they function as regulatory and/or as 
peptide-encoding RNAs is an important question for understanding 
development and segmental morphological diversity. 


Discovery of small ncRNAs 


We identified 37 unannotated intron-encoded and two unannotated
intergenic small ncRNAs (<300 nucleotides) with an average
fragments per kilobase of transcript per million fragments mapped
(FPKM) >20 from total embryonic RNA-Seq (Fig. 2 and
Supplementary Table 7). Most of these ncRNAs are highly conserved in
Drosophila sibling species. We found published but unannotated
ncRNAs: a U4atac snRNA and four small Cajal-body-specific RNAs
(scaRNAs). Of the remaining 34 ncRNAs, three are box C/D-like
snoRNAs, 28 are box H/ACA-like snoRNAs, one is a scaRNA-like
RNA, and two are unclassified. One-third of these are located in the
introns of genes encoding RNA-binding proteins, the majority of which
are involved in pre-mRNA splicing (x16, SC35, tra2, Dek, Prp8,
Tudor-SN, and pUf68).
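
The FPKM unit used as an expression threshold here follows directly from its name: fragments per kilobase of transcript per million fragments mapped. A minimal sketch of that arithmetic follows; the function name and the example counts are illustrative and not taken from the study's data.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Illustrative numbers only: 30 fragments on a 300-nt RNA in a library of
# 5 million mapped fragments gives an FPKM of about 20.
example_fpkm = fpkm(30, 300, 5_000_000)
```

Because the value is normalized by both transcript length and library size, a short RNA needs far fewer fragments than a long mRNA to reach the same FPKM.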


Discovery of microRNA primary transcripts 

MicroRNAs are processed from primary microRNA transcripts
(pri-miRNAs) and are either independently transcribed or embedded in the
introns of protein-coding genes. We identified 23 putative
independently transcribed pri-miRNAs from the total embryonic RNA-Seq
and tiling array data that encode 37 annotated miRNAs
(Supplementary Table 8). Only two primary transcripts were previously annotated
(bft and iab-4). The pri-miRNAs range from 1 to 18 kb and terminate
at the mature miRNA (pre-mir-315, Supplementary Fig. 5a). Twelve of
the 23 precursors have cap analysis of gene expression (CAGE) peaks
that map at their initiation sites. pri-miRNA expression is dynamic in
embryonic development (Supplementary Fig. 5b).

Figure 2 | Discovery of small non-coding RNAs. a, Poly(A)+ (yellow) and
total RNA (blue) data from 10-12-h embryos are shown for the gp210 gene
which hosts a representative new snoRNA. The maximal number of reads in the
poly(A)+ and total RNA-Seq data are shown on the left and right of the track,
respectively. b, The predicted RNA secondary structure of snoRNAgp219 is
characteristic of a H/ACA-box snoRNA. Nucleotides that are 100% conserved
in sequence or base-pairing are indicated in green and blue, respectively.
c, Embryonic expression of the new small RNAs. The scale bar indicates FPKM
Z-scores. unsRNA, unclassified small RNA.


Overview of the Drosophila transcriptome 


We calculated expression levels of annotated genes, transcripts and 
NTRs (Supplementary Table 9) in the short poly(A)+ RNA-Seq and
tiling array data sets. From the RNA-Seq data we detected expression 
of 14,862 genes (Supplementary Fig. 7a) and 36,274 transcripts 
(Fig. 3a) with an FPKM >1 (Supplementary Tables 9-18) of which 
67% of genes and 58% of transcripts were also observed in the array 
data (score >300) (Supplementary Fig. 6 and Supplementary Tables 
19 and 20). This includes the confirmation of 87% of annotated genes 
and transcripts and the discovery of 17,745 new transcripts. In addi- 
tion, from the total RNA-Seq data we detected expression of 12,854 
genes and 32,139 transcripts with an FPKM >1 (Supplementary 
Tables 12, 13, 21 and 22) of which 77% of genes and 89% of transcripts
were also observed in the array data. Of the genes and transcripts 
observed exclusively in the total RNA-Seq data, 519 genes and 
1,005 transcripts (primarily noncoding) were previously annotated 
and 122 genes and 1,422 transcripts are new discoveries. The genes 
and transcripts not detected in any data set include small genes 
(<200 bp), members of multi-copy gene families such as ribosomal 
RNAs, paralogues (expected owing to our mapping parameters), 
genes known to be expressed at low levels or in small numbers of cells 
(for example, gustatory and odorant receptor genes), and non- 
polyadenylated transcripts. 


Expression dynamics 

We examined the dynamics of gene expression throughout development
using the short poly(A)+ RNA-Seq data. The numbers of expressed
genes (FPKM >1) (Supplementary Fig. 7a) and transcripts (Fig. 3a)
gradually increase, from 7,045 (0-2 h embryos) to 12,000 (adult males).
Adult males express ~3,000 more genes than adult females, consistent
with the known transcriptional complexity of the testis. We observed
that 40% of expressed genes are constitutively expressed in all 30 samples
(Supplementary Fig. 7b). We also observed developmentally regulated
expression of TEs (Supplementary Materials and Supplementary Fig. 8).

Figure 3 | Dynamics of gene expression. a, Transcripts expressed (FPKM
>1) in the short poly(A)+ RNA-Seq data: FlyBase 5.12, blue; modENCODE,
purple. The bar graphs indicate the number of transcripts expressed in each
sample (Supplementary Table 1); the lines indicate the cumulative number of
expressed transcripts. The lighter blue and purple lines indicate the cumulative
number of transcripts expressed in the embryonic total RNA-Seq samples. The
horizontal dotted lines indicate the number of expressed previously annotated
transcripts. F, female; M, male. b, Scatter plot of sex-biased gene expression.
Light red, female-biased annotated (n = 960); dark red, female-biased NTRs
(n = 12); light blue, male-biased annotated (n = 2,401); dark blue, male-biased
NTRs (n = 431); light grey, unbiased annotated (n = 8,217); black, unbiased
NTRs (n = 136). c, Genome coverage. For each developmental sample, the
short poly(A)+ reads were used to estimate the percentage of the genome
covered using a cutoff of two reads. The mature and primary transcripts were
inferred for the FlyBase 5.12 (dotted lines) and modENCODE (solid
lines) gene models.

We observed pronounced expression changes in over 1,500 genes 
in the first two third instar larval samples (Supplementary Fig. 7a, c). 
Expression of 1,199 genes increased at least tenfold, and 421 genes 
decreased at least tenfold (Supplementary Table 23). Nearly all of the 
upregulated genes are expressed for the first time during the third 
instar stage and most are poorly characterized genes. 
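
The tenfold criterion above can be written as a simple ratio test between two samples. The sketch below is an assumption-laden illustration rather than the study's procedure: the pseudocount (added so that genes with zero expression in one sample do not cause division by zero) and the function name are hypothetical choices.

```python
def classify_fold_change(before, after, fold=10.0, pseudocount=1.0):
    """Label a gene 'up', 'down' or 'unchanged' between two expression values.

    `before` and `after` are expression levels (for example FPKM); the
    pseudocount is an illustrative choice, not one taken from the paper.
    """
    ratio = (after + pseudocount) / (before + pseudocount)
    if ratio >= fold:
        return "up"
    if ratio <= 1.0 / fold:
        return "down"
    return "unchanged"


# Hypothetical expression values for three genes in consecutive samples.
labels = [classify_fold_change(b, a) for b, a in [(0, 9), (99, 9), (5, 8)]]
```

A gene switched on for the first time, as described for most of the upregulated set, corresponds to a `before` value near zero.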

The earliest known event in metamorphosis is the ‘mid-3rd
transition’, identified by the synchronous changes in the transcription of a
number of well studied genes, Ecdysone-induced protein 28/29kD and
Fat body protein 1 (reviewed in ref. 31), and the switch from proximal
to distal promoters of Alcohol dehydrogenase. These markers coincide
with the surge reported here. The mid-3rd transition has no
morphological or behavioural correlates and is associated with a pulse of the
steroid hormone ecdysone acting through a non-standard receptor.
Whether the onset of testis development is a consequence of the
mid-3rd transition, or whether the two events are functionally related,
remains to be investigated.

Over 29% of protein-coding genes showed significant sex-biased
expression in adults (false discovery rate <0.1%), with more
male-biased (1,829) or male-specific genes (572) than female-biased (945)
or female-specific genes (15) (Supplementary Tables 24 and 25, and
Fig. 3b). Known female (ovo and otu) and male (dj) sex-biased genes
were expressed as expected. We found that 74% of the NTRs expressed
in adults were significantly male-biased whereas only 2.1% were
significantly female-biased.

Table 1 | Classification of alternative splicing events

Splicing event                FlyBase r5.12  modENCODE        New events       Short poly(A)+ RNA-Seq  Significantly changing
Cassette exons                793            2,717            2,014            2,369                   1,539
Alternative 5′ splice sites   843            5,192            4,599            4,583                   3,142
Alternative 3′ splice sites   879            6,253            5,505            5,579                   3,242
Mutually exclusive exons      229            251              123              228                     226
Coordinate cassette exons     301            1,227            979              992                     467
Alternative first exons       1,767          4,936            3,442            4,473                   3,996
Alternative last exons        227            604              432              553                     471
Retained/unprocessed introns  1,434          2,679 (5,667)    1,275 (4,263)    2,439 (35,641)          868 (8,998)
Total                         6,437          23,859 (26,847)  18,369 (21,478)  21,216 (54,418)         13,951 (22,081)

The number of retained/unprocessed introns in parentheses indicates the total number identified, whereas the number not in parentheses indicates the subset of identified events that have been validated by cDNA sequences or FlyBase 5.12 annotations.


Genome coverage 


Mature mRNAs are encoded by 20% of the D. melanogaster genome 
and primary transcripts by 60% (Fig. 3c). An additional 15% of the 
genome (~75% total) is detected when considering all of the short 
poly(A)+ RNA-Seq data. However, as greater than 99% of the reads 
map within the bounds of the transcript models, the reads that map to 
intergenic regions constitute a small minority of our data. Thus, 
although pervasive transcription of mammalian genomes has been 
observed in microarray studies, we found little evidence of such 
‘dark matter’ (that is, pervasive transcription) in D. melanogaster. 


Discovery and dynamics of alternative splicing 

To characterize constitutive and alternative splicing, we identified 
71,316 splice junctions, of which 22,965 were new discoveries. Of the 
new splice junctions, 26% were supported by multiple experimental 
data types and 74% by only one data type (Supplementary Fig. 9a), 
primarily short poly(A)+ RNA-Seq. Of the 20,751 new junctions from 
the short poly(A)+ RNA-Seq data, 7,833 were incorporated into new 
transcript models or transcribed regions (NTRs). The remaining new 
junctions have yet to be incorporated into transcript models. 

We also identified a total of 102,026 exons (Supplementary Table 
26). Of the 52,914 representing new and revised exons, 65% were 
validated by capture and sequencing of cDNAs and 2,586 were sup- 
ported by RNA-Seq data from D. mojavensis and D. pseudoobscura. 
Of the new exons, 3,392 were identified from the new splice junctions 
but have yet to be incorporated into transcript models. 

To examine splicing dynamics throughout development, we categorized
all splicing events into the common types of alternative splicing
events (Table 1). We identified a total of 23,859 splicing events, of
which 18,369 were new or recategorized, a threefold increase from
annotated splicing events. An additional 2,988 retained/unprocessed
introns were identified that were supported by only one experimental
data type. In all, 7,473 genes contain at least one alternative splicing
event, which is 60.7% of the 12,295 expressed multi-exon genes, also a
threefold increase in the fraction of genes with alternatively spliced
transcripts. Although smaller than the fraction of human genes with
alternatively spliced transcripts (95%), a larger proportion of
Drosophila genes encode alternative transcripts than was previously
known.

Of the new alternative exons, 8,226 were previously annotated as 
constitutive. As observed previously, annotated cassette exons, and their
flanking introns, are more highly conserved than annotated constitutive 
exons (Fig. 4a). The newly discovered cassette exons are more highly 
conserved than the new constitutive exons, although both classes are 
less conserved than the corresponding class of annotated exons. New 
cassette exons that were previously annotated as constitutive exons 
are the most highly conserved set of exons (Fig. 4a). Annotated and 
new cassette exons show a strong tendency to preserve reading frame 
(Supplementary Fig. 9b), indicating that these transcripts increase 
protein diversity. Both annotated and new cassette exons tend to be 
shorter than their constitutive counterparts, although both sets of new 
exons tend to be shorter than annotated exons. 

Figure 4 | Developmentally regulated splicing events. a, Conservation of
internal constitutive and cassette exons >50 nucleotides that were annotated or
new discoveries. (Annotated constitutive, n = 26,127; annotated cassette,
n = 438; modENCODE cassette, n = 173; modENCODE constitutive, n = 306;
FlyBase 5.12 constitutive to modENCODE cassette, n = 304.) b, Clusters of
regulated cassette exon events during development. The scale bar indicates
Z-scores of Ψ. c, Regulated alternative splicing in CadN during embryogenesis.
The maximal number of reads in the poly(A)+ RNA-Seq data is indicated for
each track.

To assess the extent of splicing variation we calculated the ‘per cent
spliced in’ or Ψ (ref. 38) for each splicing event in each sample as well as
the switch score (ΔΨ) by determining the difference between the highest
and lowest Ψ values across development (ΔΨ = Ψ_max − Ψ_min). This
revealed a very smooth distribution of ΔΨ among all events, indicating
that the splicing of most exons is fairly constant whereas only a minority
change markedly (Supplementary Fig. 9c and Supplementary Table 27).
Only 831 splicing events have a ΔΨ value >90. Further statistical
analyses (see Supplementary Methods) identified 13,951 (66%) alternative
splicing events that change significantly throughout development
(Supplementary Table 28).
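
As a rough illustration of the switch score described above, the sketch below computes Ψ from inclusion and exclusion read counts and then takes the difference between the highest and lowest Ψ across samples. The read counts, sample names and the simple inclusion/(inclusion + exclusion) ratio are assumptions made for the example; the calculation actually used (JuncBASE, per the Methods Summary) may differ in details such as length normalization.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Return psi (0-100) for one event in one sample, or None if there are no reads."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return 100.0 * inclusion_reads / total


def switch_score(psi_values):
    """Return delta-psi = max(psi) - min(psi) over samples where psi is defined."""
    defined = [p for p in psi_values if p is not None]
    if len(defined) < 2:
        return None
    return max(defined) - min(defined)


# Hypothetical (inclusion, exclusion) read counts for one cassette exon.
counts = {
    "embryo_0-2h": (95, 5),
    "larva_L3": (40, 60),
    "adult_male": (2, 98),
}

psi = {sample: percent_spliced_in(inc, exc) for sample, (inc, exc) in counts.items()}
delta_psi = switch_score(psi.values())
```

Under these assumptions an event would count among the markedly changing ones when `delta_psi` exceeds 90, the cut-off quoted in the text.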

Hierarchical clustering of cassette exon events revealed the dynamic 
nature of splicing throughout development (Fig. 4b), as exemplified by 
Cadherin-N (CadN), a gene with three sets of mutually exclusive exons 
(Fig. 4c). In each set, one exon is preferentially included in early embryos, 
the other in late embryos, with a smooth transition between the two. Our 
analysis also identified groups of exons that have coordinated splicing 
patterns (Fig. 4b). A set of 55 genes contain exons that are preferentially 
included in early embryos, late larvae, early pupae and females but 
skipped in all other stages. Gene Ontology (GO) analysis of these genes 
indicates that many encode proteins involved in epithelial cell-to-cell 
junctions. GO analysis of genes that contain exons preferentially 
included during late pupal and adult stages indicates that many encode 
proteins that are part of neuronal synapses. 


Sex-biased alternative splicing 


Sex determination in Drosophila is mediated by a cascade of regulated 
alternative splicing events involving Sex lethal (Sxl), transformer (tra), 
male-specific lethal 2 (msl-2), doublesex (dsx) and fruitless (fru) that 
specify nearly all physical and behavioural dimorphisms between 
males and females as well as X chromosome dosage compensation. 
Our RNA-Seq data confirm sex-biased (ΔΨ = |Ψ_male − Ψ_female|) 
splicing of Sxl (ΔΨ = 89.6), tra (ΔΨ = 39.2), dsx (ΔΨ = 59.7) and 
fru (ΔΨ = 100). 

In addition to the canonical sex-determination cascade, we identified
119 strongly sex-biased splicing events (ΔΨ > 70) (Supplementary
Fig. 9d). One striking example is Reps, which was annotated as
containing six constitutive exons. RNA-Seq data indicate that exon
five is a sex-biased alternative cassette exon (ΔΨ = 73.39) (Supplementary
Fig. 10). This highly conserved exon is included in males and
skipped in females. The intron upstream of this cassette exon contains
conserved SXL binding sites, indicating that Reps is regulated by SXL and is
a candidate sex differentiation gene.
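
The sex-bias measure used in this section, ΔΨ = |Ψ_male − Ψ_female|, together with the ΔΨ > 70 cut-off for strongly sex-biased events, can be sketched in a few lines of Python. The event names and Ψ values below are hypothetical and serve only to show the filter; they are not taken from the data set.

```python
STRONG_BIAS_CUTOFF = 70.0  # delta-psi threshold quoted in the text


def sex_bias(psi_male, psi_female):
    """Return delta-psi = |psi_male - psi_female| for one splicing event."""
    return abs(psi_male - psi_female)


# Hypothetical events: name -> (psi in males, psi in females), both on a 0-100 scale.
events = {
    "event_a": (98.0, 4.5),
    "event_b": (55.0, 50.0),
    "event_c": (10.0, 85.0),
}

strongly_biased = {
    name: sex_bias(male, female)
    for name, (male, female) in events.items()
    if sex_bias(male, female) > STRONG_BIAS_CUTOFF
}
```

Because the measure is an absolute difference, male-biased and female-biased inclusion are both retained by the same filter.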


Discovery of RNA editing sites 

Figure 5 | Discovery of RNA editing events. a, Rows represent edited sites.

Previous studies identified 127 sites in 55 Drosophila genes that
undergo A-to-I RNA editing. This post-transcriptional modification
is catalysed by dADAR, which is expressed at increasing levels
throughout development and is thought to target products involved in
nervous system function. We analysed the poly(A)+ RNA-Seq data to
identify exonic nucleotide positions consistent with A-to-I editing
and defined 972 edited positions within transcripts of 597 genes,
including previously described edited sites in the transcripts of 36
genes (Supplementary Table 29). These genes include those required
for rapid neurotransmission and other widely ranging functions. For
most sites, the frequency of editing increases throughout development
and does not correlate with overall expression levels (Fig. 5a). Editing
typically begins in late pupal stages, although we find transcripts that
seem to be edited in late embryogenesis. Consistent with earlier studies,
exons containing editing sites are more highly conserved than unedited
exons. The majority of the edited positions (630) alter amino acid coding;
the others are either silent (201) or within untranslated regions
(141). For example, the transcripts of quiver (qvr) are edited at six positions,
four of which result in amino acid changes (Fig. 5b). qvr encodes a
potassium channel subunit that modulates the function of the voltage-gated
Shaker (SH) potassium channel. Sh transcripts are also edited
at multiple positions. The combinatorial editing of both proteins
probably has an important role in modulating action potentials in
the arthropod nervous system and may have implications for the regulation
of sleep. Expressed sequence tags, long poly(A)+ RNA-Seq and
cDNAs cross-validate nearly one-quarter (214) of the newly discovered
sites.
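
To make the notion of editing frequency concrete, the following sketch scores candidate A-to-I sites from per-position base counts. Inosine is read as guanosine during sequencing, so an edited site appears as an A-to-G mismatch against the reference. The data layout, the coverage and frequency thresholds, and the restriction to plus-strand transcripts are all assumptions of this example and are not the criteria used in the study.

```python
def editing_frequency(base_counts):
    """Fraction of reads showing G at a reference A, or None without A/G coverage."""
    a_reads = base_counts.get("A", 0)
    g_reads = base_counts.get("G", 0)
    if a_reads + g_reads == 0:
        return None
    return g_reads / (a_reads + g_reads)


def candidate_sites(pileup, reference, min_coverage=10, min_frequency=0.05):
    """Return {position: frequency} for reference-A positions that look edited.

    pileup:    {position: {base: read count}}  (hypothetical layout)
    reference: {position: reference base}
    Only plus-strand transcripts are handled; minus-strand genes would show T-to-C.
    """
    sites = {}
    for position, counts in pileup.items():
        if reference.get(position) != "A":
            continue
        if counts.get("A", 0) + counts.get("G", 0) < min_coverage:
            continue
        frequency = editing_frequency(counts)
        if frequency >= min_frequency:
            sites[position] = frequency
    return sites


# Hypothetical reference bases and read counts.
reference = {100: "A", 101: "C", 102: "A", 103: "A"}
pileup = {
    100: {"A": 30, "G": 10},  # 25% of reads show G at a reference A
    101: {"C": 35, "G": 5},   # reference is not A, ignored
    102: {"A": 40},           # no G reads
    103: {"A": 3, "G": 2},    # below the coverage threshold
}
edited = candidate_sites(pileup, reference)
```

A real analysis would also need to exclude genomic polymorphisms and sequencing errors, which this sketch does not attempt.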

Computational analysis identified three potential editing-associated 
sequence motifs (Fig. 5a). We observe 381 sites with one or more motifs 
in close proximity to the edited nucleotide (Supplementary Table 30). 
Motif C, although less common than motifs A and B, is more strongly 
associated with the editing site. Most (93%) instances of motif C occur 
on the sense strand of the transcript and the A at the 3’ end of the motif 
is the edited nucleotide. This motif is over-represented in editing 
events that occur early in development. 


Discussion 


Our interrogation of the transcriptome of D. melanogaster throughout 
development has considerably expanded the number of building 
blocks used to make a fly. Specifically, we identified nearly 2,000 
NTRs, increased the number of alternative splicing events by threefold 
and the number of RNA editing sites by an order of magnitude. The 
resulting view of the transcriptome at single-base resolution markedly 
improves our understanding of expression dynamics throughout the 
Drosophila life cycle and has substantial biological implications. 

The D. melanogaster, C. elegans and human genomes are organized 
quite differently. Specifically, 20%, 45% and 2.5% of the D. melanogaster, 
C. elegans and human genomes, respectively, encode exons or mature 
transcripts. Primary transcripts comprise a larger fraction of each 
genome—60%, 82% and 37%. This highlights the fact that primary 
transcripts and introns are much shorter in D. melanogaster and C. 
elegans than in human and that the D. melanogaster and C. elegans 
genomes are more compact than the human genome. 

The existence of unannotated genes was indicated by microarray 
studies and conservation among Drosophilid genomes. However, 
the NTRs that we identified were not identified by comparative sequence 
analysis as they are less conserved than most previously known genes. 
This emphasizes the importance of using both comparative analyses and 
transcriptome profiling for genome annotation. 

Despite the depth of our sequencing, the annotation of the D. mela- 
nogaster transcriptome is not finished. We failed to detect expression of 
1,488 annotated genes including members of gene families to which 
short reads cannot be uniquely mapped and genes expressed at low 
levels or in spatially and temporally restricted patterns. Moreover, 
although we substantially increased the fraction of genes that encode 
alternatively spliced or edited transcripts, we again failed to detect several 
annotated RNA processing events. Study of more temporally and spa- 
tially restricted samples will allow deeper exploration of the Drosophila 
transcriptome, and almost certainly result in the discovery of yet addi- 
tional features. Furthermore, functional studies of the new and previ- 
ously unstudied elements will provide valuable insight into metazoan 
development. 


METHODS SUMMARY 


Animal staging, collection and RNA extraction. Isogenic (y1; cn bw1 sp1)
embryos were collected at 2-h intervals for 24 h. Collection of later staged animals
started with synchronized embryos and included resynchronizing with appropriate
age indicators. Six larval, six pupal and three sexed adult stages (1, 5 and 30
days) were collected. RNA was isolated using TRIzol (Invitrogen), treated with DNase and
purified on an RNeasy column (Qiagen). poly(A)+ RNA was prepared from an
aliquot of each total RNA sample using an Oligotex kit (Qiagen).

Tiling arrays. RNAs from three biological replicates of each sample were independently
hybridized on 38-bp arrays (Affymetrix GeneChip Drosophila Tiling
2.0R array) as described.

RNA-Seq. Libraries were generated and sequenced on an Illumina Genome
Analyser IIx using single or paired-end chemistry and 76-bp cycles. SOLiD sequencing
used total RNA treated with the RiboMinus Eukaryote Kit (Invitrogen).
Samples were fragmented, adaptors ligated (Ambion) and sequenced for 50 bases
using SOLiD V3 chemistry. 454 sequencing used poly(A)+ RNA from Oregon R
adult males and females and mixed-staged y1; cn bw1 sp1 embryos. Sequences are
available from the Short Read Archive and the modENCODE website
(http://www.modencode.org/).

Targeted RT-PCR and cDNA isolation and sequencing. Standard procedures 
were used for RT-PCR and targeted cDNA isolation and sequencing. 

Analysis. Cufflinks was used to identify new transcript models and to calculate
expression levels for annotated and predicted transcript models. MFold was
used to predict secondary structures from the new snoRNA-like RNAs.
JuncBASE identified alternative splicing events and calculated per cent spliced
in (Ψ). Editing sites were identified by comparing aligned reads to the reference
genome.


Comprehensive analysis of the chromatin 
landscape in Drosophila melanogaster 

Peter J. Sabo, Erica Larschan, Andrey A. Gorchakov, Tingting Gu, Daniela Linder-Basso, Annette Plachetka, 
Gregory Shanower, Michael Y. Tolstorukov, Lovelace J. Luquette, Ruibin Xi, Youngsook L. Jung, Richard W. Park, 
John A. Stamatoyannopoulos, Manolis Kellis, Sarah C. R. Elgin, Mitzi I. Kuroda, Vincenzo Pirrotta, Gary H. Karpen 
& Peter J. Park 


Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which have an impact on cell 
differentiation, gene regulation and other key cellular processes. Here we present a genome-wide chromatin landscape 
for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial 
patterns. Integrative analysis with other data (non-histone chromatin proteins, DNase I hypersensitivity, GRO-Seq 
reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, 
genes, regulatory elements and other functional domains. We find that active genes display distinct chromatin 
signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions and genomic contexts. 
We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This 
systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are 
regulated, and will serve as a resource for future experimental investigations of genome structure and function. 


The model organism Encyclopedia of DNA Elements (modENCODE) 
project is generating a comprehensive map of chromatin components, 
transcription factors, transcripts, small RNAs and origins of replication 
in Drosophila melanogaster and Caenorhabditis elegans. Drosophila
has been used as a model system for over a century to study chromosome
structure and function, gene regulation, development and
evolution. The availability of high-quality euchromatic and heterochromatic
sequence assemblies, extensive annotation of functional elements,
and a vast repertoire of experimental manipulations enhance
the value of epigenomic studies in Drosophila.

Genome-wide profiling of chromatin components provides a rich 
annotation of the potential functions of the underlying DNA sequences. 
Previous work has identified patterns of post-translational histone modi- 
fications and non-histone proteins associated with specific elements (for 
example, transcription start sites, enhancers), as well as delineating the 
transcriptional status of genes and large domains. Here we present a 
comprehensive picture of the chromatin landscape in a model eukaryotic 
genome. We define combinatorial chromatin ‘states’ at different levels of 
organization, from individual regulatory units to the chromosome level, 
and relate individual states to genome functions. 


Combinatorial chromatin states 


We performed chromatin immunoprecipitation (ChIP)-array analysis
for numerous histone modifications and chromosomal proteins
(Supplementary Table 1), using antibodies tested for specificity and
cross-reactivity (Supplementary Fig. 1). Here we describe analyses of
cell lines S2-DRSC (S2) and ML-DmBG3-c2 (BG3), derived from late
male embryonic tissues (stages 16-17) and the central nervous system
of male third instar larvae, respectively (see http://www.modencode.org
for data from other cell lines and animal stages). Analysis reveals groups
of correlated features, including those associated with heterochromatic
regions, Polycomb-mediated repression, and active transcription
(Supplementary Fig. 2), similar to those observed in other organisms.
This indicates that specific histone modifications work together to
achieve distinct chromatin ‘states’.

We used a machine-learning approach to identify the prevalent
combinatorial patterns of 18 histone modifications, capturing the overall
complexity of chromatin profiles observed in S2 and BG3 cells with 9
combinatorial states (Fig. 1a, Methods). The model associates each
genomic location with a particular state, generating a chromatin-centric
annotation of the genome (Fig. 1b). We examined each state for enrichment
in non-histone proteins (Fig. 1a and Supplementary Fig. 3) and
gene elements, as well as distribution across the karyotype (Fig. 1b and
Supplementary Fig. 4) and finer-scale levels (Fig. 1c-e).
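
The idea of associating each genomic location with one of a small set of combinatorial states can be illustrated with a deliberately simplified stand-in. This excerpt does not specify the machine-learning model, so the sketch below substitutes a plain nearest-centroid rule: each genomic bin receives the state whose typical enrichment profile is closest to its own. The three marks chosen, the centroid values and the restriction to four states are hypothetical and are not taken from the study.

```python
import math

# Hypothetical mean log2 enrichment per state for (H3K4me3, H3K36me3, H3K27me3).
CENTROIDS = {
    "state 1 (active TSS)": (2.0, 0.2, -0.8),
    "state 2 (elongation)": (0.1, 1.8, -0.6),
    "state 6 (Polycomb)": (-0.5, -0.7, 1.9),
    "state 9 (silent)": (-0.6, -0.6, -0.1),
}


def assign_state(profile):
    """Return the state whose centroid is nearest (Euclidean) to the bin's profile."""
    return min(CENTROIDS, key=lambda state: math.dist(profile, CENTROIDS[state]))


# Hypothetical enrichment profiles for three consecutive genomic bins.
bins = [(1.7, 0.0, -0.5), (0.3, 1.5, -0.4), (-0.4, -0.5, 1.5)]
annotation = [assign_state(profile) for profile in bins]
```

Unlike this per-bin rule, a model that takes neighbouring bins into account would tend to produce contiguous domains rather than isolated calls.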

Figure 1 | Chromatin annotation of the Drosophila melanogaster genome.
a, A 9-state model of prevalent chromatin states found in S2 and BG3 cells. Each
chromatin state (row) is defined by a combinatorial pattern of enrichment (red)
or depletion (blue) for specific chromatin marks (first panel, columns; active
marks in green, repressive in blue). For instance, state 1 is distinguished by
enrichment in H3K4me2/me3 and H3K9ac, typical of transcription start sites
(TSS) in expressed genes. The enrichments/depletions are shown relative to
chromatin input (S2 data shown, see Supplementary Fig. 3 for BG3 data and
histone density normalization). The second panel shows average enrichment of
chromosomal proteins. The third panel shows fold over/under-representation
of genic and TSS-proximal (±1 kb) regions relative to the entire tiled genome.
The enrichment of intronic regions is relative to genic regions associated with
each state. b, A genome-wide karyotype view of the domains defined by the
9-state model in S2 cells. Centromeres are shown as open circles, and dashed
lines span gaps in the genome assembly. Several prominent chromatin
organization features are illustrated (colour code in a), including the extent of
pericentromeric heterochromatin (state 7) and the H4K16ac-driven signature
of the dosage-compensated male X chromosome (state 5). (BG3 in
Supplementary Fig. 4.) c-e, Examples of chromatin annotation at specific loci.
c, Two distinct chromatin signatures of transcriptionally active genes: one (left)
is associated with enrichment in marks of states 3 and 4, whereas the other
(right) is limited to states 1 and 2, recapitulating well established TSS and
elongation signatures (note that small patches of state 7 in CG13185 illustrate
H3K9me2 found at some expressed genes in S2 cells). d, A locus containing
two Polycomb-associated domains, silent (left) and balanced (right). e, A large
state 8 domain located within euchromatic sequence in BG3 cells, enriched for
chromatin marks typically associated with heterochromatic regions, but at
lower levels than in pericentromeric heterochromatin (state 7).

Most distinct chromatin states are associated with transcriptionally
active genes. Active promoter and transcription start site (TSS)-proximal
regions are identified by state 1 (Fig. 1; red), marked by prominent
enrichment in H3K4me3/me2 (tri/dimethylation of residue K4 of
histone H3) and H3K9ac (acetylation of K9 of histone H3). The transcriptional
elongation signature associated with H3K36me3 enrichment
is captured by state 2 (purple), found preferentially over exonic regions of
transcribed genes. State 3 (brown), typically found within intronic
regions, is distinguished by high enrichment in H3K27ac, H3K4me1
and H3K18ac. A related chromatin signature is captured by state 4
(coral), distinguished by enrichment of H3K36me1, but notably lacking
H3K27ac. The number of genes associated with each chromatin state and
the distribution of states within genes are shown in Supplementary Fig. 5.

Several aspects of large-scale organization are revealed by the karyotype
view (Fig. 1b). Chromosome X is markedly enriched for state 5 (green),
distinguished by high levels of H4K16ac in combination with H3K36me3
and other marks of ‘elongation’ state 2 (a combinatorial pattern associated
with dosage compensation in male cells). Pericentromeric heterochromatin
domains and chromosome 4 are characterized by high levels of
H3K9me2/me3 (state 7, dark blue). Finally, the model distinguishes
another set of heterochromatin-like regions containing moderate levels
of H3K9me2/me3 (state 8, light blue; Fig. 1e). Surprisingly, this state
occupies extensive domains in autosomal euchromatic arms in BG3 cells,
and in chromosome X in both cell lines.

Further aspects of chromatin organization can be visualized by folding
the chromosome using a Hilbert curve (Fig. 2a), which maintains the
spatial proximity of nearby elements. Thus, local patches of corresponding
colours reveal the sizes and relative positions of domains associated
with particular chromatin states (Fig. 2b and Supplementary Figs 6-9).
For instance, specks of TSS-proximal regions (state 1) are typically contained
within larger blocks of transcriptional elongation marks (state 2),
which are in turn encompassed by extensive patches of H3K36me1-enriched
domains (state 4) and variable-sized blocks of state 3. The
clusters of open chromatin formed by these gene-centric patterns are
separated by extensive silent domains (state 9) and regions of Polycomb-mediated
repression (state 6). Factors responsible for domain boundaries
were not identified in our analysis (Supplementary Fig. 10).
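
The folding described above can be sketched with the standard index-to-coordinate mapping for a Hilbert curve: consecutive positions along the chromosome land in neighbouring cells of a square grid, so a contiguous domain shows up as a compact patch. The function names, the grid order and the per-bin state list are illustrative assumptions; the way the published figures were rendered may differ.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order by 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def fold(states, order):
    """Place per-bin chromatin states (in chromosome order) onto the folded grid."""
    n = 1 << order
    grid = [[None] * n for _ in range(n)]
    for d, state in enumerate(states[: n * n]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = state
    return grid


# Hypothetical annotation: 16 consecutive bins, each labelled with a state number.
bin_states = [9, 9, 9, 4, 4, 2, 1, 2, 2, 4, 3, 3, 4, 9, 9, 9]
folded = fold(bin_states, order=2)
```

Each row of `folded` can then be drawn as a row of coloured pixels, one colour per state.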

We also developed a multi-scale method to characterize chromatin 
organization at the spatial scale appropriate for the genome properties 
being investigated. For example, we observe that chromatin patterns most 
accurately reflect the replication timing of the S2 genome at scales of 
~170 kb (Supplementary Information, section 1). This is consistent with 
size estimates of chromatin domains influencing replication timing, and 
suggests that multiple replication origins are coordinately regulated by the 
local chromatin environment (each replicon is ~28-50 kb). 

To examine combinatorial patterns not distinguished by the simplified
9-state model, we also generated a 30-state combinatorial model that uses
presence/absence probabilities of individual marks (Supplementary
Fig. 11). The increased number of states can identify finer variations that
are biologically significant, for example, a signature corresponding to
transcriptional elongation in heterochromatic regions.

Figure 2 | Visualization of spatial scales and organization using compact
folding. a, The chromosome is folded using a geometric pattern (Hilbert space-
filling curve) that maintains spatial proximity of nearby regions. An illustration
of the first four folding steps is shown. Note that although this compact curve is
optimal for preserving proximity relationships, some distal sites appear adjacent
along the fold axis (green dots). b, Chromosome 3L in S2 cells. A domain of a
given chromatin state appears as a patch of uniform colour of corresponding
size. Thin black lines are used to separate regions that are distant on the
chromosome. The folded view illustrates chromatin organization features that
are not easily discerned from a linear view: active TSSs (state 1) appear as small
specks surrounded by elongation state 2, commonly next to larger regions
marked by H3K36me1-driven state 4, which also contains patches of intron-
associated state 3. These open chromatin regions are separated by extensive
domains of state 9. See Supplementary Figs 6 and 7 for other chromosomes and
BG3 data. The folded views can be browsed alongside the linear annotations and
other relevant data online: http://compbio.med.harvard.edu/flychromatin.


Chromatin state variation among genes 

Active genes generally display enrichments or depletions of individual
marks at specific gene segments (Fig. 3a). When classified according
to their chromatin signatures (Supplementary Fig. 12), active genes
fall into subclasses correlated with expression magnitude (Supplementary
Information, section 2), gene structure and genomic
context (for example, heterochromatic genes combine H3K9me2/me3
with some active marks). Of particular interest is one class of
long expressed genes, many with regulatory functions, which are
enriched for H3K36me1 (cluster 2, Supplementary Fig. 12; 131 genes
in S2, 202 in BG3; Supplementary Table 2).

Figure 3 | Chromatin patterns associated with transcriptionally active
genes. a, Location and extent of chromatin features relative to boundaries of
expressed genes (≥1 kb) in BG3 cells. The colour intensity indicates the relative
frequency of enrichment/depletion (red/blue) of a given mark within the gene
(normalized independently for each mark). b, Regions enriched for ‘active’
chromatin marks in long transcribed genes. The plot shows the extent of regions
enriched for various active marks at transcriptionally active genes (≥4 kb) on
BG3 autosomes. Each row represents a scaled gene. The first column illustrates
coding exons; the last column shows chromatin state annotation. The clustering
of the genes according to the spatial patterns of chromatin marks separates
genes with a high fraction of coding sequence (red subtree, bottom) from genes
containing long introns (green subtrees, top), which are associated with
chromatin state 3 (last column) and binding of specific chromosomal proteins,
such as Nipped-B (also see Supplementary Fig. 13).

To examine further the patterns associated with long genes, we
clustered expressed autosomal genes ≥4 kb based on blocks of enrichment
for each chromatin mark (Fig. 3b; 1,055 genes). We observe that
genes with large 5’-end introns (green subtree, Fig. 3b; 552 genes)
show extensive H3K27ac and H3K18ac enrichment, broader H3K9ac
domains, and blocks of H3K36me1 enrichment (chromatin state 3,
Fig. 3b, last column). These genes are enriched for developmental and
regulatory functions (Supplementary Table 3), and are positioned
within domains of Nipped-B (Fig. 3b), a cohesin-complex loading
protein previously associated with transcriptionally active regions.
In contrast, genes with more uniformly distributed coding regions
(red subtree, Fig. 3b) lack most state 3 marks, and H3K9ac enrichment
is restricted to the 2 kb downstream of the TSS. These differences are
not explained by variation in histone density (Supplementary Fig. 13).
Overall, the presence or absence of state 3 is the most common difference
in the chromatin composition of expressed genes that are 1 kb
and longer (Supplementary Fig. 14), and the presence of state 3 consistently
correlates with a reduced fraction of coding sequence in the
gene body, mainly associated with the presence of a long first intron.

State 3 domains are highly enriched for specific chromatin remodelling
factors (SPT16 (also known as DRE4) and dMI-2; Supplementary Figs 15
and 16), whereas state 1 regions around active TSSs are preferentially
bound by NURF301 (also called E(bx)) and MRG15. ISWI is enriched in
both states 1 and 3 (Supplementary Figs 16 and 17). State 3 domains also
exhibit the highest levels of nucleosome turnover, and show higher
enrichment of the transcription-associated H3.3 histone variant than
either the TSS- or elongation-associated states 1 and 2 (Supplementary
Figs 15 and 16). Consistent with earlier analyses of cohesin-bound
regions, state 3 sequences tend to replicate early in G1 phase, and show
abundance of early replicating origins (Supplementary Fig. 18). A regulatory
role for state 3 domains is suggested by enrichment for a known
enhancer binding protein (dCBP/p300) in adult flies, and for enhancers
validated in transgene constructs (Supplementary Fig. 19).


Modes of regulation in Polycomb domains

In Drosophila, loci repressed by Polycomb group (PcG) proteins are
embedded in broad H3K27me3 domains that are regulated by
Polycomb response elements (PREs) bound by E(Z), PSC and dRING
(Fig. 1d). We find that regions of H3K4me1 enrichment surround all
PREs, 90% of which also display narrower peaks of H3K4me2 enrichment
(Supplementary Fig. 20). Although this pattern is reminiscent of transcriptionally
active promoter regions, PREs lack H3K4me3, indicating that a
different mechanism of H3K4 methylation is used, perhaps involving the
Trithorax H3K4 histone methyltransferase (HMTase) found at all PREs.

To examine chromatin states associated with PcG targets, we analysed 
the chromatin and transcriptional signatures of TSSs in Polycomb-bound 
domains (Fig. 4a and Supplementary Fig. 21). In addition to fully repressed 
TSSs (cluster 1, Fig. 4a), we identify TSSs maintained in the ‘balanced’ 
state (cluster 2, Fig. 4a), distinguished by coexistence of Polycomb with 
active marks (including the HMTase ASH1) and production of full-length 
messenger RNA transcripts (for example, Psc domain, Fig. 1d). 

TSSs in clusters 3 and 4 are distinguished by the presence of adja- 
cent PREs (Fig. 4a). Surprisingly, 53% of the PRE-proximal TSSs 
produce short RNA transcripts” (cluster 3, Fig. 4a), indicating stalling 
of engaged RNA pol II°°. Using the global run-on sequencing (GRO- 
Seq) assay to accurately assess engaged RNA polymerases’, we 
observe that cluster 3 TSSs produce short transcripts in the sense 
orientation. The level of GRO™ signal is similar to that found at fully 
transcribed genes (Supplementary Fig. 22); thus, some transcription 
initiates in cluster 3, but elongation fails. Interestingly, these genes are 
enriched for regulatory and developmental functions, even more than 
other genes within Polycomb domains (see Supplementary Tables 4 
and 5). Genes without TSS-proximal PREs generally lack short tran- 
script signatures (for example, cluster 1 in Fig. 4a; see Supplementary 
Fig. 21 for exceptions). Importantly, engaged polymerases and tran- 
scripts are not a general feature of PREs; TSS-distal PREs typically lack 
short RNA and GRO-Seq signals (Fig. 4b and Supplementary Fig. 22) 
despite being similarly enriched in H3K4me1/me2. The striking link 
between TSS-proximal PREs and the production of short RNAs sug- 
gests a potential mechanism for control of these developmental regu- 
latory genes, whereby the same features that recruit H3K4 methyl marks 
to PREs may also facilitate RNA pol II recruitment to nearby TSSs. 


DHS plasticity and chromatin states 


We used a DNase I hypersensitivity assay’*”’ to examine the distributions 
of putative regulatory regions and their relationships with chromatin 
states. DNase I hypersensitivity mapping broadly identifies sites with 
low nucleosome density and regions bound by non-histone proteins*”*. 
Short-read sequencing identified 8,616 high-magnitude DNase I hyper- 
sensitive sites (DHSs) in S2 cells and 6,354 in BG3 cells (and a com- 
parable number of low-magnitude DHSs; Supplementary Fig. 23 and 
Methods). Approximately half of the high-magnitude DHSs are found at 
transcriptionally active TSSs (Supplementary Fig. 24). Thus, the chro- 
matin context of the TSS-proximal DHSs is dominated by the features 
expected for an active TSS, including RNA pol II, H3K4me3 and other 
state 1 marks (clusters 1, 2; Fig. 5a and Supplementary Fig. 25). 

Of the 36% TSS-distal DHSs, most (60%) are positioned within 
annotated expressed genes (Supplementary Fig. 24). These gene-body 
DHSs are distinguished from TSS-proximal DHSs by low H3K4me3, 
higher levels of H3K4me1, H3K27ac, and other marks linked to chro- 
matin state 3 (clusters 3, 4; Fig. 5a and Supplementary Fig. 26). An 
additional 20% of the TSS-distal DHSs are outside of annotated genes, 
but show signatures associated with active transcription starts or 
elongation, suggesting new alternative promoters or unannotated 
genes (Supplementary Figs 27 and 28). The remaining 20% of TSS- 
distal DHSs that appear to be intergenic (6% of all DHSs) are typically 
enriched for H3K4me1, but lack other active marks (cluster 5, Fig. 5a). 

Most DHS positions fall into the TSS-proximal state 1 or the intron- 
biased state 3 (Fig. 5b). State 3 lacks H3K4me3 and is enriched for 
H3K4me1, H3K27ac and H3K18ac, similar to mammalian enhancer 
elements*’. Many state 3 DHS positions are occupied by known regulatory 
proteins: GAGA factor binds to 49% of these DHSs in S2 cells, and 
developmental transcription factors bind to 44% of these DHSs in 
embryos**. Notably, we find that TSS-distal DHSs in Drosophila exhibit 
low-level bi-directional transcripts (Fig. 5a, shortRNA panel; see also Sup- 
plementary Figs 29 and 30), analogous to the enhancer RNAs (eRNAs) 
characterized in mice’’. Analysis of GRO-Seq data (Fig. 5e) indicates that 
eRNA-like transcripts are common to both intra- and intergenic TSS- 
distal DHSs in Drosophila, a feature that is conserved with mammals. 

The association of DHSs with chromatin states 1 and 3 (Fig. 5c) 
persists even in chromosome 4 and pericentromeric heterochromatin, 
where such states are infrequent (Supplementary Fig. 31). This suggests 
that these chromatin states and associated remodelling factors (for 
example, ISWI, SPT16) provide the context necessary for non-histone 
chromosomal protein binding at DHSs, or are the consequence of such 
binding events. To investigate this interdependency, we analysed a 
high-confidence set of loci that exhibit DHSs in only one of the two 
examined cell lines (Supplementary Fig. 32). Surprisingly, although in 
general more DHSs are in state 1 regions, 91% of the cell-type-specific 
DHSs are found within state 3 domains (14-fold increase compared to 
state 1 DHSs; Supplementary Table 6 and Fig. 5d). Comparison with 
DHSs in an additional cell type (Kc167, Supplementary Fig. 33) con- 
firms that DHSs displaying plasticity between cell types are mostly 
found in state 3. When DHSs are absent, the altered loci maintain 
chromatin state 3 in 23% of the cases (Fig. 5d), indicating that the 
presence of state 3 is not always dependent on the DHS. More fre- 
quently, the altered loci transition to state 4 (43% of the cases), an open 
chromatin state that lacks many of the histone modifications and chromatin
remodellers characteristic of state 3. Although the less frequent
transitions to the Polycomb state 6 (7%) or background state 9 (17%)
typically coincide with gene silencing, most of the genes that maintain
state 3 or transition to state 4 remain transcriptionally active
(Supplementary Fig. 34). These observations provide further support for an
enhancer-like function for state 3 DHSs, and suggest a more subtle
regulatory role than simple linkage to the presence or absence of gene
expression.

Figure 4 | Signatures of TSSs within domains of Polycomb-mediated
repression. a, Distinct classes of TSSs in S2 cell Polycomb domains. Each row
represents a TSS. Clusters 1-5 illustrate distinct TSS states (see Supplementary
Fig. 21 for complete set of clusters). Cluster 1 shows fully repressed TSSs with
the expected pattern of PC and H3K27me3 enrichment; cluster 2 shows 21 TSSs
found within ASH1 domains, maintained in a balanced state. Clusters 3 and 4
distinguish TSSs located in the immediate proximity of Polycomb response
elements (PREs), showing the symmetrical H3K4me1/me2 enrichment typical
of all PREs. Many such TSSs (cluster 3, 42 TSSs) produce short,
non-polyadenylated transcripts along the sense strand (GRO+/shortRNA+
columns), indicating the presence of paused polymerase. b, PRE positions
distant from annotated TSSs. TSS-distal PREs exhibit enrichment for
H3K4me1/me2, but are not associated with GRO or shortRNA signatures.

Figure 5 | Chromatin signatures of regulatory elements identified by DNase
I hypersensitivity. a, Representative classes of high-magnitude DNase I
hypersensitive sites (DHSs) and chromatin signatures in S2 cells. TSS-proximal
(within 2 kb) DHSs show chromatin signatures expected of expressed gene
promoters: high H3K4me3 and RNA pol II signal extending in the direction of
transcription (left to right; cluster 2 groups bi-directional promoters).
TSS-distal DHSs are associated with high H3K4me1 and low H3K4me3 levels. Most
TSS-distal DHSs found within the bodies of expressed genes (clusters 3, 4) are
associated with chromatin state 3. A cluster of rare intergenic DHSs (cluster 5)
is associated with localized peaks of H3K4me1/2 (complete sets of clusters in
Supplementary Figs 25, 26 and 28). b, Distribution of DHS positions among
chromatin states. The vast majority of DHSs are found within the TSS-proximal
state 1 or enhancer-like state 3 regions. c, States 1 and 3 exhibit the highest
density of DHSs. d, Cell-line-specific DHSs are positioned predominantly
within the enhancer-like state 3. The transition matrix shows the chromatin
state of loci containing DHSs in one cell line (x axis), and the state of the same
locus in the other cell line where the DHS is absent (y axis). Most of the DHSs
that differ between cell lines originate from state 3. When DHSs are absent, the
loci typically transition to an open chromatin state 4 (43%), or maintain state 3
(23%). In both scenarios, most of the associated genes remain transcriptionally
active (see Supplementary Fig. 34). e, Low levels of engaged RNA polymerase
are associated with TSS-distal DHSs. The top plot shows the local increase in
the antisense GRO-Seq signal for DHSs located within transcribed genes;
dashed lines show median levels. Intergenic DHS positions (bottom plot) also
show bi-directional GRO-Seq signal of comparable magnitude. See
Supplementary Figs 27, 29 and 30.


Chromatin annotation of genome functions


The genomic chromatin state annotation and discovery of refined
chromatin signatures for chromosomes, domains, and subsets of regulatory
genes demonstrate the utility of a systematic, genome-wide profiling
of an organism that is already understood in considerable detail.
Clearly, the definition and functional annotation of chromatin patterns
will be enhanced by incorporation of data for different types of components.
Five ‘colours’ of chromatin were recently identified in Kc167
cells using chromosomal protein maps. Comparison with our 9-state
model shows similarities as well as differences in the ability to distinguish
functional elements (Supplementary Fig. 35); thus, further integration
of such data in the same cell type may resolve additional
functional features. Our results illustrate the utility of integrating multiple
data types (histone marks, non-histone proteins, chromatin
accessibility, short RNAs and transcriptional activity) for comprehensive
characterization of functional chromatin states.

An important, repeated theme is that chromatin state analysis
identifies unexpected distinctions between subsets of active genes.
Besides the differences linked to genomic context (for example, male
X chromosome, heterochromatin), the main source of variability is
the presence of the acetylation-rich state 3 (Fig. 6). Several lines of
evidence suggest that the intronic positions marked by state 3 are
important for gene regulation. State 3 regions show specific associations
with known chromatin remodellers (SPT16, dMi-2 and ISWI)
and gene regulatory proteins (for example, GAF, dCBP/p300), and the
highest rates of nucleosome turnover and transcription-dependent
deposition of the H3.3 variant. State 3 genes are also bound by cohesin
complex proteins, thought to associate with decondensed chromatin
to promote looping interactions with promoter regions.

A regulatory role for state 3 chromatin is further suggested by the 
high density of DHSs, comparable to that of active TSS state 1, and the 
fact that state 3 accounts for most of the DHS plasticity among cell 
types. The combinations of histone marks found in state 3 are similar to 
signatures of mammalian enhancers”, which also show high variability 
between cell types*’. Furthermore, state 3 DHSs exhibit low levels of 
short, non-coding bidirectional transcripts reminiscent of eRNAs iden- 
tified in mice*’. Together, these findings suggest that state 3 regions 
contain enhancers or other regulatory elements, and that a combination 
of modifications can be used to identify new elements in the genome. 

Genes within repressive Polycomb domains also display several distinct 
combinatorial chromatin patterns (Fig. 4a), which probably represent a 
range of functional states: repressed, paused, or expressed genes in either 
balanced” or fully activated states. Alternatively, distinct signatures might 
mark subsets of regulatory genes that require either long-term repression 
or the ability to reverse functional states, depending on environmental or 
developmental cues. The PRE-proximal paused TSSs have some similarity 
to the ‘bivalent’ genes in mammalian cells, which also display transcrip- 
tional pausing of key regulatory and developmental genes*"’. However, 
the mammalian ‘bivalent state’ is characterized by the simultaneous
presence of PcG proteins, H3K27me3 and H3K4me3, which in
Drosophila is found only in the fully elongating ‘balanced’ state.
Comprehensive analysis of chromatin signatures has enormous
potential for annotating functional elements in both well studied
and new genomes. Going forward, our systematic characterization
of the epigenomic and transcriptional properties of Drosophila cells
should spur in-depth experimental analyses of the relationship
between chromatin states and genome functions, ranging from whole
chromosomes down to individual regulatory elements and circuits.

Figure 6 | Spatial arrangements of chromatin states associated with active
transcription. Unlike short or exon-rich expressed genes, expressed genes with
long intronic regions commonly contain one or more regions of enhancer-like
state 3, associated with specific chromosomal proteins, high nucleosome
turnover and DHSs displaying cell-type plasticity.


METHODS SUMMARY 


Histone modification and chromosomal protein antibodies were characterized 
for cross-reactivity. ChIP-chip was performed in duplicate, using Affymetrix 
Drosophila Tiling 2.0R Arrays. Digital DNase I-Seq assays were performed as 
described previously“, and Global Run-On library (GRO-Seq) data was generated 
as described previously*'. Short RNA data was generated by ref. 30, and RNA-Seq 
data was generated by ref. 45. See ref. 46 for other modENCODE RNA-Seq data. 
The chromatin state models were generated as hidden Markov models (HMMs) of 
different histone marks. DHSs were identified as read density peaks significantly 
enriched relative to the genomic DNA control. Clustering of chromatin signatures 
was determined using the PAM algorithm. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Growth conditions. ML-DmBG3-c2 cells were obtained from DGRC (https:// 
dgrc.cgb.indiana.edu/), and S2-DRSC cells were from the DRSC (http:// 
www.flyrnai.org/). All cell lines were grown to a density of ~5 X 10° cells ml”! 
in Schneider’s media (Gibco) supplemented with 10% FCS (HyClone). 10 µg
ml⁻¹ insulin was added to the ML-DmBG3-c2 media. 

Antibodies. Antibodies are listed in Supplementary Table 1. Commercial 
antibodies against modified histones were tested by western blot for the lack of 
cross-reactivity with the corresponding recombinant histone produced in 
Escherichia coli and non-histone proteins from embryonic nuclear extracts. 
Antibody specificity was further assayed by western dot/slot blot against a panel 
of synthetic modified histone peptides. Only antibodies that showed <50% of 
total signal associated with non-histone proteins, and more than fivefold higher 
affinity for the corresponding histone peptide, were used in ChIP experiments. 

The specificity of antibodies against chromosomal proteins was tested by western 

blots with nuclear extracts prepared from mutant flies or S2 cells subjected to RNAi 
knockdown”. An antibody was considered specific if it recognized a major band of 
expected mobility that was absent in the sample prepared from mutant flies, or 
diminished twofold or more after RNAi depletion. When possible, distributions of a 
chromosomal protein were mapped with two antibodies generated against different 
epitopes (see Supplementary Fig. 17). Data from chromatin proteins for which only 
one antibody was available were validated by comparison with published genomic 
distributions for a different component of the same complex, or to published 
genomic distributions generated with a different antibody. 
ChIP and microarray hybridization. Crosslinked chromatin from cultured cells 
was prepared as described”* with the following modifications. Before ultrasound 
shearing, cells were permeabilized with 1% SDS, and shearing was done in TE- 
PMSF (0.1% SDS, 10 mM Tris-HCl pH 8.0, 1 mM EDTA pH 8.0, 1 mM PMSF) 
using a Bioruptor (Diagenode) (2 × 10 min, 1 × 5 min; 30 s on, 30 s off; high 
power setting). 

ChIP was performed as in ref. 28 and immunoprecipitated DNA was amplified 
using the whole genome amplification kit (WGA2, Sigma) according to the 
manufacturer’s instructions (chemical fragmentation step was omitted). The 
amplified material was labelled and hybridized to Drosophila Tiling Arrays 
v2.0 (Affymetrix) as in ref. 28. 

Processing of ChIP data. At least two independent biological replicates were 
assessed for each ChIP profile. The log2 intensity ratios (M values) were calculated 
for each replicate. The profiles were smoothed using local regression (lowess) 
with 500 bp bandwidth, and the genome-wide mean was subtracted. The regions 
of significant enrichment were determined as clusters of at least 1 kb in length, 
with gaps no more than 100 bp where M value exceeds a statistically significant 
(0.1% false discovery rate (FDR)) enrichment threshold. The set of biological 
replicates was deemed consistent if the enriched regions from individual experi- 
ments had a 75% reciprocal overlap, or if at least 80% of the top 40% of the regions 
identified in each experiment were identified in the other replicate (before com- 
parison the replicates were size-equalized by increasing the significance threshold 
for a replicate with more enriched sequence). The data from individual replicates 
were then combined using local regression smoothing, and used for all of the 
presented analysis, unless noted otherwise. 
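
The region-calling rule described above (probes above an enrichment threshold, merged across gaps of no more than 100 bp, kept only if the cluster spans at least 1 kb) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name, the input layout (probe positions paired with smoothed M values) and the threshold passed as a plain number are assumptions. In the paper the threshold is derived from a 0.1% FDR estimate, which is not reproduced here.

```python
def call_enriched_regions(positions, m_values, threshold,
                          max_gap=100, min_length=1000):
    """Return (start, end) clusters of probes whose M value exceeds threshold.

    Consecutive above-threshold probes are merged when they lie no more
    than max_gap bp apart; clusters shorter than min_length bp are dropped.
    All names and the input layout are hypothetical.
    """
    regions = []
    start = end = None
    for pos, m in sorted(zip(positions, m_values)):
        if m <= threshold:
            continue  # probe is not enriched; it may still fall inside a gap
        if start is not None and pos - end <= max_gap:
            end = pos  # extend the current cluster
        else:
            if start is not None and end - start >= min_length:
                regions.append((start, end))
            start = end = pos  # open a new cluster
    if start is not None and end - start >= min_length:
        regions.append((start, end))
    return regions
```

With probes every 50 bp, a single below-threshold probe leaves a 100-bp gap and therefore does not split a cluster, whereas a 1-kb stretch without enriched probes does.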

DNase I hypersensitivity. Digital DNase I-Seq assays were performed as described 
previously**. The sequenced reads were aligned to the Berkeley Drosophila 
Genome Project release 5 (BDGP.R5) genome assembly, recording only uniquely 
mappable reads. To detect DNase I hypersensitive sites, hotspot positions were 
identified based on a 300-bp scanning window statistic (Poisson model relative to 
50 kb background density, Z-score threshold of 2), and peaks of read density were 
selected within the hotspots using randomization-based thresholding at 0.1% FDR. 
The set of high-magnitude DHSs analysed here (except for Supplementary Fig. 23) 
was identified as a subset of all peaks that show statistically significant enrichment 
over the normalized genomic DNA read density profile (using a 300-bp window 


centred around the peak, binomial model, with Z-score threshold of 3). This 
method controls for copy number variation and sequencing/mapping biases; 
however, it may also reduce the sensitivity of DHS detection. In the DHS chro- 
matin profile clustering analysis (Fig. 5a, relevant Supplementary figures), DHSs 
found within 1 kb of another DHS were excluded if their enrichment magnitude 
(relative to genomic background) was lower (to avoid showing the same region 
more than once). 
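
As a rough sketch of a scanning-window statistic of this kind, the snippet below scores a 300-bp window against the read density of its surrounding 50-kb background. The exact statistic used for hotspot calling is not spelled out in the text, so the normal approximation to a Poisson count used here, together with the function and parameter names, is an assumption made purely for illustration.

```python
import math


def window_z_score(window_reads, background_reads,
                   window_bp=300, background_bp=50_000):
    """Z-score of a window read count under a Poisson background model.

    The expected count is the background density scaled to the window
    size; for a Poisson variable the variance equals the mean, which
    gives the normal-approximation z-score below. Hypothetical helper.
    """
    expected = background_reads * window_bp / background_bp
    if expected <= 0:
        # No background reads: any signal is infinitely surprising here.
        return math.inf if window_reads > 0 else 0.0
    return (window_reads - expected) / math.sqrt(expected)


def is_hotspot(window_reads, background_reads, z_threshold=2.0):
    """Apply the Z-score threshold of 2 quoted in the text."""
    return window_z_score(window_reads, background_reads) > z_threshold
```

For example, 5,000 reads in the 50-kb background imply 30 expected reads per 300-bp window, so a window with 41 reads scores just above the threshold of 2.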
RNA sequencing. The preparation of RNA-Seq libraries and sequencing is 
described in ref. 45. The sequenced reads were aligned to the BDGP.R5 genome 
assembly and annotated exon junctions, recording only uniquely mappable reads. 
The RPKM (reads per kilobase of exonic sequence per million reads mapped) was 
estimated for each exon. The total transcriptional output of each annotated gene 
was estimated based on the maximum of all exons within the gene. The presented 
analysis uses log10(RPKM + 1) values unless otherwise noted. 
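
The expression measure described above can be written out in a few lines. This is a sketch under stated assumptions: the function names and the representation of a gene as a list of (read count, exon length) pairs are hypothetical, and read alignment is assumed to have been done already.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exonic sequence per million reads mapped."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def gene_expression(exons, total_mapped_reads):
    """log10(RPKM + 1) of the most highly covered exon of a gene.

    exons is an iterable of (read_count, exon_length_bp) pairs; the
    gene-level value is the maximum over exons, as stated in the text.
    """
    best = max(rpkm(count, length, total_mapped_reads)
               for count, length in exons)
    return math.log10(best + 1)
```

Taking the maximum over exons, rather than the mean, keeps a gene from looking silent when only some of its annotated exons are used in the cell type at hand.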
GRO sequencing. Global Run-On library was prepared from S2 cells and 
sequenced as described*’. The reads were aligned to the BDGP.R5 genome assembly, 
recording only uniquely mappable reads. The smoothed profiles of reads mapping 
to each strand were calculated using Gaussian smoothing (σ = 100 bp). The analysis 
uses log10(d + 1), where d is the smoothed density value. 
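
A strand-specific smoothed profile of this kind could be computed as in the sketch below. It assumes per-base read counts for one strand held in a plain list and truncates the Gaussian kernel at four standard deviations; both choices, and the function names, are illustrative assumptions rather than details given in the text.

```python
import math


def gaussian_smooth(counts, sigma=100.0, truncate=4.0):
    """Smooth per-base read counts with a normalized Gaussian kernel."""
    radius = int(truncate * sigma)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2)
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(counts)
    smoothed = [0.0] * n
    for i, count in enumerate(counts):
        if count == 0:
            continue  # skip empty positions; most bases carry no reads
        for k, weight in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                smoothed[j] += count * weight
    return smoothed


def log_density(smoothed):
    """Apply the log10(d + 1) transform used for the analysis."""
    return [math.log10(d + 1) for d in smoothed]
```

Each strand would be smoothed separately, so that sense and antisense signals around a site can be compared.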
Short RNA data processing. The short RNA data for S2 cells was generated by 
ref. 30, and was aligned and processed in the same way as the GRO-Seq data. 
Chromatin state models. To derive a 9-state joint chromatin state model for S2 
and BG3 cells (Fig. 1a), the genome was first divided into 200-bp bins, and the 
average enrichment level was calculated within each bin based on unsmoothed 
log intensity ratio values taking into account individual replicates, using all 
histone enrichment profiles and PC to discount the genome-wide difference in 
S2 H3K27me3 profiles. The bin-average values of each mark were shifted by the 
genome-wide mean, scaled by the genome-wide variance, and quantile-normalized 
between the two cells. The hidden Markov model (HMM) with multivariate normal 
emission distributions was then determined from the Baum-Welch algorithm 
using data from both cell types, and 30 seeding configurations determined with 
K-means clustering. States with minor intensity variations (Euclidean distance of 
mean emission values <0.15) were merged. Larger models (up to 30 states) were 
examined, and the final number of states was chosen for optimal interpretability. 
An extensive discrete chromatin state model (Supplementary Fig. 11) was 
calculated as described in ref. 20. The model was trained using a 200-bp grid 
with binary calls (enriched/not enriched). The binary calls were made based on a 
5% FDR threshold determined from ten genome-wide randomizations for each 
mark. For H1, H4 and H3K23ac, regions of significant depletion rather than 
enrichment were called. 
Regions of enrichment for individual marks. To determine contiguous regions 
of enrichment for individual marks, a three-state HMM was used, with states 
corresponding to enriched, neutral and depleted profiles (normally-distributed 
emission parameters: μ = [−0.5, 0, 0.5], σ² = 0.3). The enriched regions were 
determined from the Viterbi path. The HMM segmentation was applied to 
unsmoothed M value data taking into account individual biological replicates. 
The genes were clustered based on the combinatorial pattern of occurrence of 
enriched regions (coding exons and state panels were not used for clustering). 
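
The three-state segmentation can be illustrated with a small Viterbi decoder. The emission means and variance below are the ones quoted in the text; the transition probabilities are not given there, so the `stay` probability of 0.99, the uniform starting distribution and all function names are assumptions chosen for this sketch.

```python
import math

MEANS = (-0.5, 0.0, 0.5)  # depleted, neutral, enriched (from the text)
VARIANCE = 0.3            # shared emission variance (from the text)


def log_gauss(x, mean, var=VARIANCE):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def viterbi(values, stay=0.99):
    """Most likely state path (0 depleted, 1 neutral, 2 enriched).

    stay is an assumed self-transition probability; the remaining
    probability mass is split evenly between the other two states.
    """
    if not values:
        return []
    n_states = len(MEANS)
    log_stay = math.log(stay)
    log_move = math.log((1 - stay) / (n_states - 1))
    score = [math.log(1 / n_states) + log_gauss(values[0], m) for m in MEANS]
    back = []
    for x in values[1:]:
        new_score, pointers = [], []
        for s, mean in enumerate(MEANS):
            cands = [score[p] + (log_stay if p == s else log_move)
                     for p in range(n_states)]
            best = max(range(n_states), key=cands.__getitem__)
            pointers.append(best)
            new_score.append(cands[best] + log_gauss(x, mean))
        score = new_score
        back.append(pointers)
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Enriched regions are then the maximal runs of state 2 along the returned path. With a high `stay` probability a single outlying probe is not enough to open a region, which is the practical reason for segmenting with an HMM rather than thresholding each probe.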
Classification of enrichment profiles. Clustering of chromatin signatures around 
TSSs (Fig. 4a), PREs (Fig. 4b) and DHSs (Fig. 5a and relevant Supplementary 
Information sections) was determined using the Partitioning Around Medoids 
algorithm. For clustering, each profile was summarized with average values within 
bins spanning ±2-kb regions. One-hundred-base-pair bins were used for the 
central ±500-bp region, 300-bp bins outside. 
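
The binning scheme used to summarize each profile before clustering can be sketched as below. The helper names and the half-open bin convention are assumptions; the clustering step itself (Partitioning Around Medoids) is not shown, and any PAM implementation could be applied to the resulting vectors.

```python
import bisect


def bin_edges(flank=2000, core=500, core_bin=100, outer_bin=300):
    """Bin edges for a +/-flank window: fine bins in the core, coarse outside."""
    left = list(range(-flank, -core, outer_bin))
    centre = list(range(-core, core, core_bin))
    right = list(range(core, flank + 1, outer_bin))
    return left + centre + right


def summarize(offsets, values, edges):
    """Average values within each half-open bin [edge_i, edge_i+1).

    offsets are positions relative to the anchor (TSS, PRE or DHS).
    Bins without data yield None; offsets outside the window are ignored.
    """
    sums = [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    for offset, value in zip(offsets, values):
        idx = bisect.bisect_right(edges, offset) - 1
        if 0 <= idx < len(sums):
            sums[idx] += value
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

With the default parameters this gives 20 bins per profile: five 300-bp bins on each side and ten 100-bp bins across the central 1 kb, so the region closest to the anchor carries the most weight in the clustering.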


47. Clemens, J. C. et al. Use of double-stranded RNA interference in Drosophila cell 
Interaction-based quantum metrology showing 
scaling beyond the Heisenberg limit 


M. Napolitano, M. Koschorreck, B. Dubost, N. Behbood, R. J. Sewell & M. W. Mitchell 


Quantum metrology aims to use entanglement and other quantum
resources to improve precision measurement. An interferometer
using N independent particles to measure a parameter $\mathcal{X}$ can
achieve at best the standard quantum limit of sensitivity,
$\delta\mathcal{X} \propto N^{-1/2}$. However, using N entangled particles and exotic
states, such an interferometer can in principle achieve the Heisenberg
limit, $\delta\mathcal{X} \propto N^{-1}$. Recent theoretical work has argued that
interactions among particles may be a valuable resource for quantum
metrology, allowing scaling beyond the Heisenberg limit.
Specifically, a k-particle interaction will produce sensitivity
$\delta\mathcal{X} \propto N^{-k}$ with appropriate entangled states and
$\delta\mathcal{X} \propto N^{-(k-1/2)}$ even without entanglement. Here we
demonstrate ‘super-Heisenberg’ scaling of $\delta\mathcal{X} \propto N^{-3/2}$ in a
nonlinear, non-destructive measurement of the magnetization of an
atomic ensemble. We use fast optical nonlinearities to generate a
pairwise photon-photon interaction (corresponding to k = 2) while
preserving quantum-noise-limited performance. We observe
super-Heisenberg scaling over two orders of magnitude in N, limited at
large numbers by higher-order nonlinear effects, in good agreement with
theory. For a 
measurement of limited duration, super-Heisenberg scaling allows 
the nonlinear measurement to overtake in sensitivity a comparable 
linear measurement with the same number of photons. In other 
situations, however, higher-order nonlinearities prevent this cross- 
over from occurring, reflecting the subtle relationship between 
scaling and sensitivity in nonlinear systems. Our work shows that 
interparticle interactions can improve sensitivity in a quantum- 
limited measurement, and experimentally demonstrates a new 
resource for quantum metrology. 

The most precise instruments are interferometric in nature, and 
operate according to the laws of quantum mechanics. A collection of 
particles, for example photons or atoms, is prepared in a superposition 
state, allowed to evolve under the action of a Hamiltonian containing 
an unknown parameter, $\mathcal{X}$, and measured in agreement with quantum 
measurement theory. The complementarity of quantum measurements
determines the ultimate sensitivity of these instruments. 

Here we describe polarization interferometry, used, for example, in 
optical magnetometry to detect atomic magnetization; similar theory
describes other interferometers. A collection of N photons, with circular
plus- and minus-polarization eigenstates, $|+\rangle$ and $|-\rangle$, is described by
single-photon Stokes operators
$s_i = \frac{1}{2}(|+\rangle, |-\rangle)\,\sigma_i\,(\langle +|, \langle -|)^T$,
where $\sigma_i$ (i = x, y, z) are the Pauli matrices, $\sigma_0$ is the identity and
the superscript ‘T’ denotes transposition. In traditional quantum metrology,
a Hamiltonian of the form $H = \hbar\mathcal{X}\sum_{j=1}^{N} s_z^{(j)}$, where $\hbar$
denotes Planck’s constant divided by $2\pi$, uniformly and independently
couples the photons to $\mathcal{X}$, the parameter to be measured. If the input
state consists of independent photons, the possible precision scales as
$\delta\mathcal{X} \propto N^{-1/2}$, the shot noise or standard quantum limit (SQL). The
factor of $N^{-1/2}$ reflects the statistical averaging of independent results.
In contrast, entangled states can be highly, even perfectly, correlated,
giving precision limited by $\delta\mathcal{X} \propto N^{-1}$, the Heisenberg limit. 


The above Hamiltonian is conveniently written $H = \hbar\mathcal{X}S_z$, where
$\mathbf{S} \equiv \sum_{j=1}^{N} \mathbf{s}^{(j)}$ is a collective variable describing the net
polarization of the photons. The independence of the photons manifests
itself in the linearity of this Hamiltonian. Recently, it has been shown
that interactions among particles, or, equivalently, nonlinear
Hamiltonians, can contribute to measurement sensitivity and give scaling
beyond the Heisenberg limit. For example, a Hamiltonian with a kth-order
nonlinearity in $\mathbf{S}$ contains k-photon interaction terms, that is,
products of k single-photon operators. The number of such terms, and,
thus, the signal strength, grows as $N^k$, but the quantum noise from the
input states is unchanged. As a result, a sensitivity limit of
$\delta\mathcal{X} \propto N^{-k}$ applies when entanglement is used, and
$\delta\mathcal{X} \propto N^{-(k-1/2)}$ in the absence of entanglement. For k = 2,
this gives scaling better than the Heisenberg limit, so-called
super-Heisenberg scaling. We note 
that interactions and entanglement are compatible and both improve 
the scaling. The predicted advantage applies generally to quantum 
interferometry, and proposed mechanisms to produce metrologically 
relevant interactions include Kerr nonlinearities, cold collisions in 
condensed atomic gases, Duffing nonlinearity in nanomechanical 
resonators and a two-pass effective nonlinearity with an atomic 
ensemble. Topological excitations in nonlinear systems may also give 
advantageous scaling. 
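
The scaling rules quoted in this paragraph can be collected in one small function, shown here as an illustrative sketch. The function name and the use of exact fractions are choices made for this example; the exponents themselves are the ones stated in the text, $N^{-k}$ with entanglement and $N^{-(k-1/2)}$ without.

```python
from fractions import Fraction


def scaling_exponent(k, entangled):
    """Exponent e in delta_X proportional to N**e for a k-particle interaction."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if entangled:
        return -Fraction(k)
    return -(Fraction(k) - Fraction(1, 2))


def beats_heisenberg(k, entangled):
    """True when the scaling is steeper than the Heisenberg limit N**-1."""
    return scaling_exponent(k, entangled) < -1
```

For k = 1 the two cases reduce to the Heisenberg limit and the standard quantum limit, and for k = 2 without entanglement the exponent is −3/2, the super-Heisenberg case studied in this Letter.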

In this Letter, we study interaction-based quantum metrology using 
unentangled probe particles. One challenge in demonstrating super- 
Heisenberg scaling is to engineer a suitable nonlinear Hamiltonian. 
Some nonlinearities have been shown to be intrinsically noisy,
whereas others give super-Heisenberg scaling but fall short of the ideal,
$\delta\mathcal{X} \propto N^{-(k-1/2)}$, under realistic conditions. We use a cold atomic
ensemble as a light-matter quantum interface to produce
quantum-noise-limited interactions, and use a Hamiltonian of the form
$H = \hbar\mathcal{X}S_zS_0 = \hbar\mathcal{X}S_zN/2$. This Hamiltonian gives a polarization
rotation that increases with photon number, without increasing quantum 
noise’. The experiment, shown schematically in Fig. 1, uses pulses of 
near-resonant light to measure the collective spin, F, of an ensemble of 
Na ~ 10° cold rubidium-87 atoms, probed on the 5S,/2— 5P3/2 Dz 
line. The experimental system is described in detail in refs 8, 23. The 
on-axis atomic magnetization, $\langle F_z\rangle$, which plays the role of $\mathcal{X}$ in this 
measurement, is prepared in the initial state $\langle F_z\rangle = N_A$ by optical 
pumping with resonant, circularly polarized light propagating along 
the trap axis, z. A weak, on-axis magnetic field is applied to preserve $F_z$ 
during the measurements. 

Pulses of $S_x$-polarized, but not entangled, photons pass through the
ensemble and experience an optical rotation proportional to $\langle F_z\rangle$. The
light-atom interaction Hamiltonian
$H_{\mathrm{eff}} = \alpha^{(1)}F_zS_z + \beta^{(1)}F_zS_zN/2$
describes this paramagnetic Faraday rotation. Both the linear term,
$\alpha^{(1)}F_zS_z$, and the nonlinear term, $\beta^{(1)}F_zS_zN/2$, cause rotation of the
plane of polarization from $S_x$ (vertical) towards $S_y$ (diagonal).
Detection of $S_y$ then allows estimation of $F_z$. As described in
Supplementary Information, $\alpha^{(1)}$ and $\beta^{(1)}$ depend on the optical detuning,
$\Delta$, relative to the F = 1 → F′ = 0 transition; in particular, $\alpha^{(1)}(\Delta_0) = 0$ 
Figure 1 | Atom-light interface. a, Experimental schematic: an ensemble of 
7 X 10° ®’Rb atoms, held in an optical dipole trap, is prepared in the state 

$|F = 1, m_F = 1\rangle$ by optical pumping (OP). Linear (P1, P2) and nonlinear (PNL) 
Faraday rotation probe pulses (in the order P1, PNL, P2) measure the atomic 


for the specific detuning $\Delta_0 \approx 2\pi \times 468.5$ MHz, allowing a purely
nonlinear estimation to be studied. 

The rotation angle is $\phi = \langle F_z\rangle[A(\Delta) + B(\Delta)N]/2$, where
$A \propto \alpha^{(1)}$ and $B \propto \beta^{(1)}$ both account for the temporal pulse shape
and the geometric overlap between the atomic density and the spatial mode
of the probe. The shot-noise-limited uncertainty in the rotation angle, due
to quantum uncertainty in the initial angle, is $\delta\phi = N^{-1/2}/2$. A
contribution, $\langle F_z\rangle B(\Delta)\,\delta N/2$, from initial number fluctuations
$\delta N = \langle N\rangle^{1/2}$ is negligible for small rotation angles. This gives a
measurement uncertainty of


$$\delta F_z = \left|\frac{\partial\phi}{\partial\langle F_z\rangle}\right|^{-1}\delta\phi = \frac{1}{A(\Delta)N^{1/2} + B(\Delta)N^{3/2}} \qquad (1)$$


indicating a transition from SQL scaling, $\delta F_z \propto N^{-1/2}$, to
super-Heisenberg scaling, $\delta F_z \propto N^{-3/2}$, with increasing N. 
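
The transition described by equation (1) can be made concrete with a short numerical sketch. The coefficient values used in the usage note are arbitrary illustrative numbers, not the measured ones, and the function names are hypothetical; only the functional form $1/(AN^{1/2} + BN^{3/2})$ is taken from the text.

```python
import math


def delta_fz(n, a, b):
    """Measurement uncertainty from equation (1) for n probe photons."""
    return 1.0 / (a * math.sqrt(n) + b * n ** 1.5)


def local_slope(n, a, b, step=1.01):
    """Finite-difference log-log slope d ln(delta_Fz) / d ln(N) at n."""
    upper = math.log(delta_fz(n * step, a, b))
    lower = math.log(delta_fz(n, a, b))
    return (upper - lower) / math.log(step)


def crossover_photon_number(a, b):
    """Photon number at which linear and nonlinear terms are equal."""
    return a / b
```

With the illustrative values a = 1 and b = 1e-6, the slope is close to −1/2 for small N, passes through −1 at the crossover N = a/b = 1e6, and approaches −3/2 for large N. When the linear coefficient vanishes, as at the detuning chosen for the nonlinear probe, the −3/2 scaling holds at all N in this idealized model.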

We use two probing regimes. The ‘linear probe’ consists of 40 1-μs 
pulses (total illumination time, $\tau_L = 40$ μs) spread over 400 μs with 
detuning 4, >> 4. Together with the number of photons, N;, used 
in the experiment for the linear probe, this gives $A \gg N_L B$, that is,
linear estimation, and provides a projection-noise-limited quantum
non-demolition measurement of $F_z$, with uncertainty at the
parts-per-thousand level. The ‘nonlinear probe’ consists of a single,
Gaussian-shaped, high-intensity pulse with a full-duration at
half-maximum of $\tau_{NL} = 54$ ns, $N_{NL}$ photons and a detuning $\Delta_0$, such
that $A \ll N_{NL}B$. Crucially, having two probes allows us to calibrate the 
nonlinear measurement precisely using a highly sensitive and well- 
characterized independent measurement of the same sample. 

We probe the same sample three times for each preparation. First we 
use the linear probe, which gives a precise and non-destructive
measurement of $\langle F_z\rangle$ via the rotation angle, $\phi_L$. Then we use the nonlinear
probe, which gives a rotation angle, $\phi_{NL}$, that is calibrated against the
‘true’ value (that is, with negligible error) provided by the linear probe.
Finally we use a second linear probe to estimate the rotation angle $\phi_{L'}$,
with which we can estimate the damage to the atomic magnetization,
$\eta = 1 - \phi_{L'}/\phi_L$, caused by the nonlinear probe. 

The linear probe is calibrated using quantitative absorption imaging 
to measure Na, and we find that A(4,) = 3.3(1) X 10° rad per atom. 
The calibration of the nonlinear probe against the first linear probe is 
shown in Fig. 2: We repeat the above pump-probe sequence while 
varying Na in the range 1.5 X 10° to 3.5 X 10° to generate a 1 -vs-@N1. 
correlation plot for a given value of $N_{NL}$. Because both $\phi_L$ and $\phi_{NL}$ are
linear in $N_A$, we use linear regression to find the slope,
$b \equiv d\phi_{NL}/d\phi_L = B(\Delta_0)N_{NL}/A(\Delta_L)$, for that value of $N_{NL}$. The
experiment is repeated for a range of different $N_{NL}$ values. 
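
The calibration step amounts to an ordinary least-squares fit of the nonlinear rotation angle against the linear one. The sketch below assumes paired angle measurements in two plain lists; the function names and the choice of an unweighted fit are assumptions for illustration. The second helper inverts the relation between the slope and the coupling strengths stated above and neglects any saturation of the nonlinear response.

```python
import math


def fit_line(phi_l, phi_nl):
    """Unweighted least-squares fit phi_nl = intercept + slope * phi_l.

    Returns (slope, intercept, residual_std), where residual_std uses
    n - 2 degrees of freedom.
    """
    n = len(phi_l)
    if n != len(phi_nl) or n < 3:
        raise ValueError("need at least three paired measurements")
    mean_x = sum(phi_l) / n
    mean_y = sum(phi_nl) / n
    sxx = sum((x - mean_x) ** 2 for x in phi_l)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(phi_l, phi_nl))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(phi_l, phi_nl))
    return slope, intercept, math.sqrt(rss / (n - 2))


def nonlinear_coupling(slope, a_linear, n_nl):
    """Invert b = B * N_NL / A to estimate B (saturation neglected)."""
    return slope * a_linear / n_nl
```

The residual standard deviation returned by the fit is the quantity that serves as the observed uncertainty of the nonlinear probe in the noise analysis that follows.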

The observed plot of b versus $N_{NL}$, shown in Fig. 2a, is well fitted by a
simple model including saturation of the nonlinear response:

magnetization, detected by a shot-noise-limited polarimeter (PM). The atom
number is measured by quantitative absorption imaging (AI). b, Spectral
positions of the pump, probe and imaging light on the D$_2$ transition.

Here $N_{NL}^{(\mathrm{sat})} = 6.0(8) \times 10^7$ is a saturation parameter and the
nonlinear coupling strength is $B(\Delta_0)$ = 3.8(2) × 107° rad per atom per
photon.

The noise in the nonlinear probe, again as a function of $N_{NL}$, is
determined from the $\phi_L$-vs-$\phi_{NL}$ correlation plots. As illustrated in
Fig. 2b, c, the residual standard deviation of the fits indicates the
observed uncertainty, $\Delta\phi_{NL}$, which includes the intrinsic uncertainty,
$\delta\phi_{NL}$, and a small contribution from electronic noise. In Fig. 3, we plot
the fractional sensitivity, $\delta F_z/\langle F_z \rangle$, versus $N_{NL}$, calculated using
equation (2) and considering the whole polarized ensemble, with
$\langle F_z \rangle$ = 7 × 10° spins. In agreement with equation (1), the log-log slope
indicates the scaling $\delta F_z \propto N_{NL}^{-3/2}$ to within experimental
uncertainties in the range $N_{NL}$ = 10° to 10⁷, and super-Heisenberg scaling,
that is, steeper than $N^{-1}$, over two orders of magnitude
($N_{NL} = 5 \times 10^5$ to $5 \times 10^7$).

Results of numerical modelling using the Maxwell-Bloch equations
to describe the nonlinear light-atom interaction are also shown in
Fig. 3. Two curves are shown, for detunings $\Delta_0 \pm (2\pi \times 200\ \mathrm{kHz})$,
covering the combined uncertainty in $\Delta_0$ due to the probe laser
linewidth and inhomogeneous light shifts in the optical dipole trap. As
expected from equation (1), this alters the sensitivity only at low $N_{NL}$
values. The model is described in detail in Supplementary Information.

Figure 2 | Calibration of nonlinear Faraday rotation. a, Ratio of the
nonlinear rotation, $\phi_{NL}$, to the linear rotation, $\phi_L$, versus the nonlinear probe
photon number, $N_{NL}$. The data points and error bars indicate best-fit and
standard errors from a linear regression, $\phi_{NL} = b\phi_L + \mathrm{const.}$, for given values of
$N_{NL}$. The red curve is a fit using equation (2), showing the expected nonlinear
behaviour, $\phi_{NL} \propto N_{NL}$, with some saturation for large values of $N_{NL}$. b, c, $\phi_L$-vs-$\phi_{NL}$
correlation plots for two values of $N_{NL}$. The atom number, $N_A$, is varied to
produce a range of $\phi_L$ and $\phi_{NL}$ values. Green, no atoms ($N_A = 0$); red,
1.5 × 10° < $N_A$ < 3.5 × 10°; blue, $N_A$ ≈ 7 × 10°. The blue circles are shown as a
check on detector saturation, and are not included in the analysis.

Figure 3 | Super-Heisenberg scaling. Fractional sensitivity, $\delta F_z/\langle F_z \rangle$, of
the nonlinear probe plotted versus the number of interacting photons, $N_{NL}$.
Blue circles indicate the measured sensitivity, orange curves show results of
numerical modelling, and the black lines indicate SQL, Heisenberg-limit and
super-Heisenberg (SH) scaling for reference. Scaling surpassing the Heisenberg
limit, $\propto N_{NL}^{-1}$, is observed over two orders of magnitude. The measured damage
to the magnetization, $\eta$, shown as green circles, confirms the non-destructive
nature of the measurement. Error bars for standard errors would be smaller
than the symbols and are not shown.


For photon numbers above $N_{NL} \approx 2 \times 10^7$, the saturation of the
nonlinear rotation alters the slope. This can be understood as optical
pumping of atoms into states other than $|F = 1, m_F = 1\rangle$ by the
nonlinear probe. The damage to the atomic magnetization, $\eta = 1 - \phi_L'/\phi_L$,
also shown in Fig. 3, remains small, confirming the non-destructive
nature of the measurement. The finite damage even for small $N_{NL}$
values is possibly due to stray light and/or magnetic fields disturbing
the atoms during the 20-ms period between the two linear
measurements. For large N, higher-order nonlinear effects including optical
pumping limit the range of super-Heisenberg scaling.

The experimental results illustrate the subtle relationship between 
scaling and sensitivity in a nonlinear system. For an ideal nonlinear 
measurement, the improved scaling would guarantee better absolute 
sensitivity for sufficiently large N values. Indeed, when the measurement
bandwidth is taken into account, the nonlinear probe overtakes
the linear one at N = 3.2 × 10°, where both achieve a sensitivity of
1.1 × 10° spins Hz$^{-1/2}$. As a consequence, the nonlinear technique
performs better in fast measurements. In contrast, when measurement
time is not a limited resource, the comparison can be made on a
‘sensitivity-per-measurement’ basis and the ideal crossover point, of
3.2 × 10° spins at $N = 8.7 \times 10^7$, is never actually reached, owing to the
higher-order nonlinearities. Evidently, super-Heisenberg scaling allows 
but does not guarantee enhanced sensitivity: for the nonlinear tech- 
nique to overtake the linear, it is also necessary that the scaling extend to 
large enough values of N. This example shows also that resource con- 
straints dramatically influence the comparison between the linear and 
nonlinear techniques. See also Supplementary Information. 
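
The interplay between scaling and absolute sensitivity can be illustrated with a small sketch. It assumes an idealized model that is not spelled out in this section: a shot-noise-limited rotation uncertainty proportional to $N^{-1/2}$ (prefactor set to one), a linear signal $A F_z$ and a nonlinear signal $B N F_z$, with no saturation. The coupling constants below are placeholders in arbitrary units, chosen only so that their ratio reproduces the ideal per-measurement crossover photon number quoted in the text.

```python
import math

A_OVER_B = 8.7e7   # ideal per-measurement crossover photon number from the text
A = 1.0            # placeholder linear coupling (arbitrary units), not the measured value
B = A / A_OVER_B   # placeholder nonlinear coupling in the same arbitrary units


def delta_f_linear(n):
    """Sensitivity of the linear probe: SQL scaling, N**-0.5."""
    return 1.0 / (A * math.sqrt(n))


def delta_f_nonlinear(n):
    """Sensitivity of the ideal nonlinear probe: N**-1.5 scaling."""
    return 1.0 / (B * n**1.5)


def loglog_slope(f, n1, n2):
    """Scaling exponent of f between photon numbers n1 and n2."""
    return (math.log(f(n2)) - math.log(f(n1))) / (math.log(n2) - math.log(n1))


print("linear slope:   ", round(loglog_slope(delta_f_linear, 1e5, 1e7), 3))
print("nonlinear slope:", round(loglog_slope(delta_f_nonlinear, 1e5, 1e7), 3))
for n in (1e6, A_OVER_B, 1e9):
    better = "nonlinear" if delta_f_nonlinear(n) < delta_f_linear(n) else "linear"
    print(f"N = {n:.1e}: better or equal probe -> {better}")
```

In this idealized picture the two curves cross at N = A/B, so the steeper scaling only pays off if it persists up to that photon number, which is the point made in the paragraph above.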

We have experimentally realized a system designed to achieve
metrological sensitivity beyond the Heisenberg limit, in which the
uncertainty scales as $N^{-1}$, using metrologically relevant interactions
among particles. To generate pairwise photon-photon interactions, we use
fast, nonlinear optical effects in a cold atomic ensemble and measure the
ensemble magnetization, $\langle F_z \rangle$, with super-Heisenberg sensitivity
$\delta F_z \propto N^{-3/2}$. To quantify the
photon-photon interaction and the sensitivity rigorously, we calibrate
against a precise, non-destructive, linear measurement of the same
atomic quantity, demonstrate quantum-noise-limited performance
of the optical instrumentation and place an upper limit on systematic,
that is, non-atomic, nonlinearities at the level of a few per cent. The
experiment demonstrates the use of interparticle interactions as a new
resource for quantum metrology. Although possible applications to
precision measurement will require detailed study, our experiment
shows that interactions can produce super-Heisenberg scaling and
improved precision in a quantum-limited measurement.


METHODS SUMMARY 


Linear and nonlinear probe light. The probe beam is aligned on the axis of the
trap with a waist of 20 μm, chosen to match the radial dimension of the cloud. In
the linear probing regime, we use a train of 40 1-μs pulses, with a repetition rate of
100 kHz, each containing 3 × 10° photons detuned by +1.5 GHz from the
F = 1 → F′ = 0 transition. The maximum intensity is 0.1 W cm⁻². The signals
are summed and can be considered a single, modulated pulse.

The nonlinear probe consists of a single, Gaussian-shaped pulse with a
full-duration at half-maximum of 54 ns. The maximum intensity of the nonlinear
probe is 7 W cm⁻² for a pulse with 10⁷ photons. Theory predicts that the linear
rotation vanishes at a detuning of $\Delta = 2\pi \times 462$ MHz in free space. This is
modified by trap-induced light shifts, and we use the empirical value
$\Delta_0 = 2\pi \times 468.5$ MHz, which gives zero rotation at low probe intensity.

Instrumental noise. The instrumental noise is quantified by measuring var($S_y$)
versus input photon number N (that is, $N_L$ or $N_{NL}$), in the absence of atoms, to find
contributions from electronic noise ($\propto N^0$), shot noise ($\propto N^1$) and technical
noise ($\propto N^2$), as described in Supplementary Information. We find that the
contributions from electronic noise to the linear and nonlinear probes are
3 × 10° and 4 × 10° per pulse, respectively, and that the technical noise is negligible.
The instrumentation is thus shot-noise-limited over the full range of N used in the
experiment. The intrinsic rotation uncertainty of the nonlinear probe, $\delta\phi_{NL}$, is
calculated from the measured $\Delta\phi_{NL}$ by subtracting the electronic-noise
contribution in quadrature. The correction is at most 5%.
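
The noise decomposition described above amounts to a quadratic fit of the measured variance against photon number. The sketch below shows one way this could be done; the coefficients, photon-number range and helper names are hypothetical, and the data are synthetic and noise-free so that the fit is exactly recoverable.

```python
import numpy as np


def decompose_noise(n_photons, variance, n_ref=1e6):
    """Fit variance = a0 + a1*N + a2*N**2 and return (a0, a1, a2).

    a0: electronic noise, a1: shot noise, a2: technical noise.
    N is rescaled by n_ref before fitting to keep the problem well conditioned.
    """
    x = np.asarray(n_photons, dtype=float) / n_ref
    c2, c1, c0 = np.polyfit(x, np.asarray(variance, dtype=float), 2)
    return c0, c1 / n_ref, c2 / n_ref**2


def intrinsic_uncertainty(observed, electronic):
    """Remove an electronic-noise contribution in quadrature."""
    return np.sqrt(observed**2 - electronic**2)


# Synthetic, noise-free data with hypothetical coefficients.
N = np.geomspace(1e5, 1e7, 20)
A0_TRUE, A1_TRUE, A2_TRUE = 3e5, 1.0, 0.0
var = A0_TRUE + A1_TRUE * N + A2_TRUE * N**2

a0, a1, a2 = decompose_noise(N, var)
# Shot noise dominates where the linear term exceeds both other terms.
shot_dominated = (a1 * N > a0) & (a1 * N > a2 * N**2)
print(f"a0 = {a0:.3g}, a1 = {a1:.3g}, a2 = {a2:.3g}")
print("shot-noise-dominated points:", int(shot_dominated.sum()), "of", N.size)
```

With real data the same fit would show whether the quadratic (technical) term is negligible, as reported in the text.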

Instrumental linearity. The linearity of the experimental system and analysis is 
verified by using a wave plate in place of the atoms to produce a linear rotation 
equal to the largest observed nonlinear rotation. Over the full range of photon 
numbers used in the experiment, the detected rotation angle is constant to within 
5%, and SQL scaling is observed.

Intense femtosecond (10⁻¹⁵ s) light pulses can be used to transform
electronic, magnetic and structural order in condensed-matter systems
on timescales of electronic and atomic motion. This technique
is particularly useful in the study and in the control of
materials whose physical properties are governed by the interactions
between multiple degrees of freedom. Time- and angle-resolved
photoemission spectroscopy is in this context a direct
and comprehensive, energy- and momentum-selective probe of
the ultrafast processes that couple to the electronic degrees of
freedom. Previously, the capability of such studies to access
electron momentum space away from zero momentum was,
however, restricted owing to limitations of the available probing
photon energy. Here, using femtosecond extreme-ultraviolet
pulses delivered by a high-harmonic-generation source, we use
time- and angle-resolved photoemission spectroscopy to measure
the photoinduced vaporization of a charge-ordered state in the
potential excitonic insulator 1T-TiSe2 (refs 12, 13). By way of
stroboscopic imaging of electronic band dispersions at large
momentum, in the vicinity of the edge of the first Brillouin zone,
we reveal that the collapse of atomic-scale periodic long-range
order happens on a timescale as short as 20 femtoseconds. The
surprisingly fast response of the system is assigned to screening
by the transient generation of free charge carriers. Similar screening
scenarios are likely to be relevant in other photoinduced solid-state
transitions and may generally determine the response times.
Moreover, as electron states with large momenta govern fundamental
electronic properties in condensed matter systems, we
anticipate that the experimental advance represented by the present
study will be useful to study the ultrafast dynamics and microscopic
mechanisms of electronic phenomena in a wide range of
materials.

The electronic properties of many condensed matter systems are
determined by large-momentum electron states, often located near
the edge of the first Brillouin zone (BZ), the unit cell of crystalline solids
in electron momentum (k) space. A prominent current example is graphene,
whose hallmark, the critical crossing point of its peculiar conical
band dispersion, is located at the corner of the BZ. The copper oxide
based high-temperature superconductors represent another example,
where the much debated competition between the pseudogap and
superconductivity is particularly pronounced in the so-called antinodal
region of the Fermi surface, near the BZ boundary. A final example is
the new class of iron-based superconductors, which are characterized by
magnetic excitations that couple two sets of Fermi surfaces, one set
centred on the corners of the BZ. More generally, it is the coupling
of high momentum electron states near the Fermi momenta $k_F$ (the
momenta of the highest occupied electron states) that contributes most
to the linear response function of an electron liquid, sometimes even
causing a divergence that leads to phase instabilities.


Typical electron momenta near the boundary of the first BZ are in 
the 1 Å⁻¹ regime. Conventional angle-resolved photoemission spectroscopy
(ARPES) with photon energies exceeding roughly 10 eV is
probably the most powerful tool to map band structure peculiarities and 
Fermi surfaces up to and beyond these critical points: this technique can 
also, at the same time, determine many-body effects embodied in the 
fine details of band dispersions and in the distribution of spectral 
weight. ARPES is particularly well suited for layered materials—as in 
the present study—because for quasi-two-dimensional systems, the 
measured electronic structure can be considered as being predomi- 
nantly characteristic of the bulk, despite the surface sensitivity of the 
probe. The great allure of corresponding time-resolved ARPES 
experiments is the provision of direct dynamical information and 
the possibility of disentangling—via temporal discrimination—the 
various interactions between the relevant degrees of freedom that 
determine material properties in the quantum world. Here, we present 
femtosecond time-resolved ARPES experiments, in which transient 
changes in the whole occupied electronic structure between the centre 
and the edge of the BZ are probed to answer a simple fundamental 
question: how fast can long-range charge order in a solid melt? 
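
The link between photon energy and accessible momentum can be made concrete with the standard free-electron final-state relation, $k_\parallel = \sqrt{2 m E_{kin}}\,\sin\theta/\hbar$ with $E_{kin} = h\nu - \Phi - E_B$. The sketch below is illustrative only: the work function of 4.5 eV is an assumed value that does not come from the text, and the 6 eV case is included merely as an example of a low photon energy.

```python
import math

M_E = 9.1093837015e-31   # electron mass, kg
HBAR = 1.054571817e-34   # reduced Planck constant, J s
EV = 1.602176634e-19     # joules per electronvolt

WORK_FUNCTION_EV = 4.5   # assumed value, not taken from the text


def k_parallel(photon_energy_ev, angle_deg, binding_energy_ev=0.0):
    """In-plane momentum (1/angstrom) of a photoelectron emitted at angle_deg."""
    e_kin = photon_energy_ev - WORK_FUNCTION_EV - binding_energy_ev
    if e_kin <= 0:
        raise ValueError("photon energy too low for photoemission")
    k_total = math.sqrt(2.0 * M_E * e_kin * EV) / HBAR   # in 1/m
    return k_total * math.sin(math.radians(angle_deg)) * 1e-10


# Largest reachable in-plane momentum (grazing emission) for states at the Fermi level.
for hv in (6.0, 10.0, 43.0):
    print(f"hv = {hv:5.1f} eV -> k_max = {k_parallel(hv, 90.0):.2f} 1/angstrom")
```

Under these assumptions a photon energy of a few electronvolts cannot reach 1 Å⁻¹ at any emission angle, whereas roughly 10 eV and above can, in line with the statement in the text.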

The charge-ordered state we investigate is the conspicuous
(2 × 2 × 2) charge-density wave (CDW) that occurs in the layered
compound 1T-TiSe2 below a temperature of 200 K (ref. 13). Figure 1
presents the thermal equilibrium view of the CDW transition, which
affects both structural and electronic properties. On the transition, the
atoms move to new equilibrium positions such that the real-space unit
cell doubles its size in all three directions (Fig. 1a and c; arrows in Fig. 1c
indicate the atomic displacements from the normal-phase positions);
in momentum space, the dimensions of the BZ are correspondingly
halved (Fig. 1b and d). The new k-space geometry suggests that the
wave vector of the CDW is determined by an interaction between the
Se 4p valence band maximum at Γ, the centre of the BZ, and the
elliptical pocket of Ti 3d states at the BZ edge at M (Fig. 1b). These
symmetry points are connected by the new reciprocal lattice vectors
and become equivalent Γ points in the new phase, which allows direct
Se 4p–Ti 3d interaction (Fig. 1d). This interaction is in fact remarkably
strong and extremely well resolved by conventional ARPES.
Figure 1e and f compares ARPES intensity maps recorded with synchrotron
radiation (hν = 119 eV) along the Γ–M direction above
and below the CDW transition temperature. At Γ, the downward
dispersing (hole-like) Se 4p bands dominate the photoemission signal
in both maps and only a small shift of the valence band maximum is
visible. At M, where the high-temperature map shows the bottom of
the upward dispersing (electron-like) Ti 3d band, the effects are more
dramatic: the CDW leads to a strong selective transfer of spectral
weight by folding the Se 4p band from Γ onto M. It is this remarkably
strong folded Se 4p intensity which we will use in the time-resolved
experiments as a spectroscopic measure for CDW order in 1T-TiSe2.

Figure 1 | CDW phase transition of 1T-TiSe2. a, Real-space unit cell of the
normal phase at room temperature. b, Common unit cell (first BZ, purple line)
in momentum space of the room temperature phase (two-dimensional
projection onto the surface plane). Γ, M and K are high-symmetry points of
the first BZ. Blue ellipses and yellow circle indicate the Fermi surface topology
of Ti 3d and Se 4p bands, respectively. Planes show the atomic layers of the
two-dimensional crystalline structure. c, Real-space unit cell of the CDW phase.


We now apply sub-10-fs extreme ultraviolet (XUV) pulses
(hν = 43 eV, s-polarization) to monitor the transient response of the
CDW phase to excitation with infrared laser pulses (hν = 1.57 eV) of
32 fs width. Details of the time-resolved ARPES experiment are
described in Methods. Figure 2a shows ARPES intensity maps of
1T-TiSe2 measured at T = 125 K with the femtosecond XUV light
source (Methods). Despite the poorer energy resolution, both sets
of Se 4p bands are well resolved, the original one at Γ and the folded
one at M. Time-resolved experiments are performed at infrared pump
fluences between 0.2 and 5 mJ cm⁻², corresponding to an excitation
density range of 0.025 to 0.63 photons per Ti atom. Figure 2b–e shows
four photoemission snapshots recorded at a pump fluence of 5 mJ cm⁻²
with increasing temporal delay between the infrared pump and the
XUV probe, up to a maximum of 3 ps (Supplementary Movie 1). The
data have been corrected for a space charge shift of 200 meV induced
by the electron background because of multi-photon photoemission
by the infrared pump pulse. The time series is dominated by two
prominent changes in the photoemission intensity maps. First, in
instantaneous response to the infrared excitation, an electron-like
band appears, crossing the Fermi energy $E_F$ and extending (at
sufficiently small temporal delays) from M to Γ. We observe here the
transient generation of quasi-free charge carriers because of
near-resonant Ti 3d–Se 4p excitation. Second, the downward-dispersing
Se 4p band, folded onto the M point owing to the interaction with the
CDW superlattice, disappears or is at least considerably reduced in
intensity. This suggests that long-range order in the electronic
subsystem breaks down on an ultrafast timescale. In the following, we
restrict our quantitative analysis to the short-time (sub-100-fs) dynamics
of this process.

Figure 2f compares the temporal evolution of the integrated intensity 
of the folded Se 4p band—our spectroscopic measure for CDW order— 
for different pump fluences (see Supplementary Information section 3 
for details of the data analysis). Both breakdown and (partial) recovery 
of the signal (inset of Fig. 2f) are strongly dependent on the pump 
fluence. The fluence dependence of the time constant characterising 
the signal breakdown, $\tau_{\mathrm{Se\,4p}}$, is shown in Fig. 3a: at the lowest fluences,




Arrows indicate the atomic displacements from the normal-phase positions; 
the dashed lines indicate the extension of the unit cell in the normal phase. 
d, First BZ (green line) of the CDW phase. The folding of Se 4p and Ti 3d states 
is indicated. e, ARPES intensity map (electron binding energy versus 
momentum) of the room temperature phase. Photoelectron intensity is 
encoded in a false-colour scale. f, ARPES intensity map of the CDW phase. 


the initial drop in the signal is retarded by about 80 fs with respect to the
laser pulse excitation. As the fluence increases, the response becomes
continuously faster, and at the highest fluences the transient minimum
in the folded Se 4p band intensity appears well within the 32-fs-long
infrared pump laser pulse with an ultimate response time of 20 fs. For
comparison, the dynamics associated with the initial population of the
Ti 3d band due to absorption happens within the width of the infrared
pulse for the entire pump fluence regime. Notably, for the highest
excitation fluence, the folded Se 4p band follows this population
dynamics without delay (Supplementary Information section 4). The
partial recovery of the folded Se 4p intensity is observed on timescales of
several hundreds of femtoseconds. Two-temperature model calculations
following reference 21 suggest that this recovery is mostly driven
by thermalization of the electronic subsystem with the atomic lattice.

In previous studies, it has been shown that the fundamental timescales
of photoinduced phase transitions are governed by bottlenecks
associated with the characteristic response times of the relevant degrees
of freedom, such as the oscillation period of neighbouring atoms or
the hopping rate of localized electrons between neighbouring sites.
The upper solid line in Fig. 3a marks for instance the expected
short-time limit (75 fs) of the lattice response of 1T-TiSe2 to a
photoexcitation; this short-time limit is taken as one-quarter of the oscillation
period of the high-frequency CDW amplitude mode. The vaporization
of a long-range-ordered state within 20 fs is in this context
exceptionally fast. The ultrafast timescale observed in the high-fluence
regime points to a purely electronically driven process, whose response
time, however, strongly depends on the excitation fluence. The
absorption of the light pulse initially increases the free charge carrier density n
(electrons and holes), as can be seen in the instantaneous population of
the Ti 3d band. This transient free carrier population, which is directly
governed by the excitation fluence, links the time constant $\tau_{\mathrm{Se\,4p}}$ to a
material specific timescale: quantum kinetic calculations have shown
that the characteristic build-up time for carrier screening in response to
an ultrashort laser excitation is the plasma oscillation period, $\tau_{pl}$ (refs
25, 26), which scales with $1/\sqrt{n}$. As shown in Fig. 3a, $\tau_{\mathrm{Se\,4p}}$ closely
follows such a $1/\sqrt{n}$ dependence (a quantitative estimate of the
photoinduced carrier density n and the corresponding plasma oscillation
period is given in Supplementary Information section 5). It has
in fact been shown that the build-up of screening by photo-injected
carriers is relevant for the dynamics of ultrafast processes on the
sub-100-fs timescale.

Figure 2 | Tracking the photoinduced transition by femtosecond
time-resolved ARPES. a, ARPES intensity map of 1T-TiSe2 recorded with
high-harmonic XUV pulses in the CDW phase (temperature T = 125 K).
b–e, Time-resolved ARPES snapshots at increasing pump (infrared)-probe
(XUV) temporal delays (pump-pulse fluence, 5 mJ cm⁻²). Energy distribution
curves (EDCs) are provided in Supplementary Information 2.
f, Photoemission transients of the folded Se 4p band for different infrared
excitation fluences close to time zero. The infrared-XUV cross-correlation
signal, which has been determined from the laser assisted photoemission
(LAPE) signal, is added (Methods). Inset, transients up to temporal delays
of 3.5 ps.
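
The $1/\sqrt{n}$ scaling of the plasma oscillation period invoked above follows from the textbook expression $\omega_p = \sqrt{n e^2/(\varepsilon_0 \varepsilon_r m^*)}$ with $\tau_{pl} = 2\pi/\omega_p$. The sketch below evaluates it under simplifying assumptions that are not taken from the text: a free-electron mass, unit background dielectric constant and an arbitrary example density. The absolute numbers are therefore illustrative only; the quantitative estimate for the material is the one given in the Supplementary Information.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # free-electron mass, kg


def plasma_period(n_per_m3, eff_mass=M_E, eps_r=1.0):
    """Plasma oscillation period (s) for carrier density n (1/m^3).

    eff_mass and eps_r default to free-electron, vacuum values (assumptions).
    """
    omega_p = math.sqrt(n_per_m3 * E_CHARGE**2 / (EPS_0 * eps_r * eff_mass))
    return 2.0 * math.pi / omega_p


n0 = 1e27   # example density, 1/m^3 (hypothetical)
for factor in (1, 4, 16):
    tau_fs = plasma_period(factor * n0) * 1e15
    print(f"n = {factor:2d} x n0 -> tau_pl = {tau_fs:.2f} fs")
```

Each fourfold increase in carrier density halves the period, which is the dependence against which the measured time constant is compared in Fig. 3a.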

Suppression of screening is important in correlation-induced phase 
transitions. Therefore, it is not surprising that the reverse—the 
enhancement of screening by photo-injection of free carriers—can 
destroy a correlation-induced phase. The CDW transition in 1T-TiSe2
has repeatedly been associated with an excitonic insulator instability.
In this model, the transition is driven by the spontaneous formation of 
excitons, which can occur in semiconductors or semimetals when the 
bandgap or band overlap becomes smaller than the exciton binding 
energy. The exciton formation requires a low concentration of mobile 
charge carriers and a correspondingly poorly screened Coulomb inter- 
action. In the case of 1T-TiSe2, the narrow gap/overlap system would 
become unstable to the formation of Ti 3d - Se 4p excitons and would 
exhibit a new periodicity governed by the wave vector connecting the 
corresponding valence and conduction band pockets. This purely elec- 
tronic process seems consistent with the measured ultrafast response 
times and a screening-based interpretation, as bound excitons would 
certainly be screened by the photo-injected carriers. For a transition 
temperature of 200 K, energy-time uncertainty yields a response time 
of 35 fs for such a purely excitonic process (see lower solid line in 
Fig. 3a). More generally, however, our results are in line with the
screening of any (unspecified) interaction between Se 4p and Ti 3d
states described by an effective matrix element $V_{\mathrm{eff}}$. The spectral
weight of the folded Se 4p state will scale with $V_{\mathrm{eff}}^2$ and we can estimate
the effect of screening within the Thomas-Fermi approach. This
yields $V_{\mathrm{eff}}^2 \propto (1 + \mathrm{const.} \times n^{1/3})^{-2}$, in reasonable agreement with our
experimental data (see Fig. 3b). A deeper analysis, particularly with
respect to the puzzling story of the CDW phase transition in 1T-TiSe2,
requires a sophisticated theoretical description which, for instance,
considers the quantum kinetics of screening in non-equilibrium systems
or effects arising from the photo-doping of excitonic insulators.

We finally address the question of to what extent the screening by the
nascent carriers affects the properties of 1T-TiSe2 on the 20-fs timescale.
As discussed above, the response of the atomic lattice is slow so that 
structurally the sample will still resemble the CDW phase on this time- 
scale. However, the valence electronic structure becomes substantially 
modified as soon as screening becomes effective. In fact, electronically, 
the system undergoes a transition from a poorly conducting CDW state 
into a metallic phase within 20 fs. This is not only implied by the 
screening scenario but is also directly visible in Fig. 2d, which shows 
the transient metallization of the Ti 3d band only 30 fs after the optical 
excitation. 

The ultrafast breakdown of long-range charge order that we report
here is much faster than the material-characteristic oscillations of
collective modes that are commonly thought to limit the response times in
photoinduced processes. This surprising result may therefore stimulate
new concepts for ultrafast switching devices. Furthermore, in our study
we have essentially monitored the intensity of a superlattice Bragg peak
in electron momentum space. Thus, from a methodological point of
view, the presented time-resolved ARPES approach complements
present-day time-resolved diffraction experiments, with the advantage
of an exceptionally high temporal resolution.

Figure 3 | Fluence dependence of the photoinduced transition. a, Signal
drop time, $\tau_{\mathrm{Se\,4p}}$, of the folded Se 4p band (filled red circles) as a function of the
absorbed pump fluence. The ~10% error bars in the absorbed pump fluence are
determined from the uncertainty in the optical constants of 1T-TiSe2, the
stability of the pump laser, and the uncertainty in the pump pulse diameter at
the sample position. The error bars in the drop time indicate the standard
deviation of the fits to the data. b, Minimum transient spectral weight in the
folded Se 4p band (red dots) as a function of absorbed fluence. Equation of
curve fitted to data is shown in each panel. a.u., arbitrary units.


METHODS SUMMARY 

Sample preparation. 1T-TiSe2 single crystals were grown from the elements by
chemical vapour transport using iodine as transport agent. Before the photoemis- 
sion measurements, the samples were cleaved in situ at room temperature in 
ultrahigh vacuum. 

Photoemission measurements. Static ARPES experiments were conducted at 
beamline 7.0.1 of the Advanced Light Source at Berkeley with a Scienta R4000 
electron spectrometer. The photon energy was 119eV and the overall energy 
resolution was ~30 meV. Femtosecond time-resolved ARPES measurements were 
conducted at the University of Kiel with a SPECS Phoibos 150 electron spectro- 
meter. Here, the photon energy was 43 eV and the overall energy resolution in the 
experiment was ~400 meV. The light source for the time-resolved experiments 
was an argon-filled hollow-fibre waveguide (XUUS, KMLabs) for high harmonic
generation operated with a 3 kHz Ti:sapphire amplifier system (Dragon, KMLabs, 
pumped by an Empower 30, Spectra Physics). 


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

Sample preparation and photoemission set-up. 1T-TiSe2 single crystals were
grown from the elements by chemical vapour transport using iodine as transport agent.
Samples were mounted on a cryogenic manipulator and cleaved in situ under ultrahigh
vacuum (UHV) conditions at a base pressure of 3 × 10⁻¹⁰ mbar. Conventional
ARPES measurements were conducted at beamline 7.0.1 of the Advanced Light
Source at Berkeley, with a Scienta R4000 electron spectrometer. The photon energy
was 119 eV and the overall energy resolution was ~30 meV. Femtosecond
time-resolved ARPES measurements were conducted at the University of Kiel.
Photoemitted electrons were detected using a hemispherical electron energy analyzer
(SPECS, Phoibos 150) equipped with a two-dimensional detection unit for parallel
energy and momentum detection. The total energy resolution of the experiment was
mainly governed by the spectral broadening of the femtosecond XUV pulses and was
determined to be ~400 meV. An independent characterization of the high harmonic
radiation with a grating spectrometer showed that the spectral width of the used 27th
harmonic was 340 ± 40 meV. Typical integration times for analysis-grade spectra
were 3 min. High-quality data as shown in Fig. 2 required an integration time of 15 min.


Pulsed XUV light source. The light source used for the pump-probe
photoemission experiments was a 3-kHz Ti:sapphire amplifier system (Dragon, KMLabs,
pumped by an Empower 30, Spectra Physics) delivering infrared pulses at
790 nm, 1.2 mJ pulse energy and 32 fs pulse duration. For photoemission, 80%
of the pulse energy was used to generate high harmonic femtosecond XUV pulses
in an argon-filled hollow-fibre waveguide (XUUS, KMLabs). A pair of multilayer
mirrors (total reflectivity, 13%) selected the 27th harmonic (hν = 43 eV) out of
the harmonic spectrum and focused it at an angle of 45° onto the sample
mounted in the UHV system. The intensity of the 27th harmonic at the sample
position was measured in situ using a calibrated XUV photodiode (SXUV 20 HS1,
International Radiation Detectors) yielding a fluence in the 10⁷ photons s⁻¹
regime. The residual 20% of the amplifier output was available for infrared
photoexcitation of the 1T-TiSe2 sample. The temporal pulse profile of the pump
beam was characterized using the frequency-resolved optical gating technique.
The pulse width of the XUV pulses was estimated to be <10 fs from the
LAPE infrared-XUV cross-correlation traces of the Se 4p band signal of the
1T-TiSe2 sample (Fig. 2f).
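
As a consistency check on the numbers quoted above, the photon energies follow from $E = hc/\lambda$, and the pulse-energy budget from the stated 80%/20% split. The short sketch below uses only the wavelength, harmonic order and pulse energy given in the text; the variable names are illustrative.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J s
C_LIGHT = 299792458.0       # speed of light, m/s
EV = 1.602176634e-19        # joules per electronvolt


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a vacuum wavelength in nm."""
    return H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9) / EV


fundamental_ev = photon_energy_ev(790.0)   # infrared pump photon
harmonic_27_ev = 27 * fundamental_ev       # 27th harmonic used as the XUV probe

pulse_energy_mj = 1.2
hhg_energy_mj = 0.8 * pulse_energy_mj      # share driving harmonic generation
pump_energy_mj = pulse_energy_mj - hhg_energy_mj

print(f"fundamental:   {fundamental_ev:.2f} eV")
print(f"27th harmonic: {harmonic_27_ev:.1f} eV")
print(f"HHG drive: {hhg_energy_mj:.2f} mJ, pump: {pump_energy_mj:.2f} mJ")
```

The fundamental comes out at about 1.57 eV, matching the pump photon energy quoted in the main text, and the 27th harmonic at about 42.4 eV, within roughly 1 eV of the quoted 43 eV.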
Fault lubrication during earthquakes


G. Di Toro, R. Han, T. Hirose, N. De Paola, S. Nielsen, K. Mizoguchi, F. Ferri, M. Cocco & T. Shimamoto


The determination of rock friction at seismic slip rates (about
1 m s⁻¹) is of paramount importance in earthquake mechanics, as
fault friction controls the stress drop, the mechanical work and the
frictional heat generated during slip. Given the difficulty in
determining friction by seismological methods, elucidating constraints
are derived from experimental studies. Here we review a large set
of published and unpublished experiments (~300) performed in
rotary shear apparatus at slip rates of 0.1–2.6 m s⁻¹. The
experiments indicate a significant decrease in friction (of up to one order
of magnitude), which we term fault lubrication, both for cohesive
(silicate-built, quartz-built and carbonate-built) rocks and
non-cohesive rocks (clay-rich, anhydrite, gypsum and dolomite
gouges) typical of crustal seismogenic sources. The available
mechanical work and the associated temperature rise in the slipping zone
trigger a number of physicochemical processes (gelification,
decarbonation and dehydration reactions, melting and so on)
whose products are responsible for fault lubrication. The similarity
between (1) experimental and natural fault products and (2)
mechanical work measures resulting from these laboratory experiments
and seismological estimates suggests that it is reasonable to
extrapolate experimental data to conditions typical of earthquake
nucleation depths (7-15 km). It seems that faults are lubricated
during earthquakes, irrespective of the fault rock composition
and of the specific weakening mechanism involved.

The evolution of friction (shear stress, τ) during earthquakes and the
dynamic friction coefficient are key parameters in controlling
seismic fault slip and radiated energy. In the past 40 years,
experiments performed in triaxial and biaxial apparatuses under conditions
of low slip rates (V < 1 mm s⁻¹) and modest displacements (δ < 1 cm)
have shown that the friction coefficient in cohesive and non-cohesive
rocks is about 0.7 (ref. 15) irrespective of the rock type (with a few
exceptions that are of great relevance for the mechanics of mature
faults), and that frictional instabilities of a few per cent, described
by rate-and-state friction laws, are associated with earthquake
initiation. Although the above results are consistent with several
seismological and geophysical observations, the experiments were performed
at slip rates and displacements orders of magnitude smaller than those
typical of earthquakes (0.1–10 m s⁻¹ and up to 20 m, respectively).
Given the low slip rates, these experiments lack a primary aspect of
natural seismic slip: a large mechanical work rate (or instantaneous
power density, Φ(t) = τ(t)V(t)) within the slipping zone. The work
rate can be so large as to grind and mill the rock (producing particles of
nanometric size, or nanopowders), trigger mechanically and thermally
activated chemical reactions, and, eventually, melt the rock. Under
these extreme deformation conditions, the fault surfaces are separated
by fluids or other tribochemical products (for example melt, gel,
nanopowders and decarbonation products). Work rate (not work alone) is
the key parameter, as a given amount of work exchanged at a slow rate is
buffered by dissipative processes and hence produces limited reactions.

In the past 15 years, the installation and exploitation of rotary shear
apparatus designed to achieve the larger slip rates and displacements
typical of earthquakes produced unexpected experimental
results. Among these, the most surprising is the dramatic drop in
friction (of up to 90% in most cases) at seismic slip rates, independent
of the rock type and the weakening mechanism involved.

Here we report about 300 published and unpublished high-velocity
rock friction experiments performed in the rotary shear apparatuses at
Brown University and at Kyoto University (now at the Kochi
Institute for Core Sample Research, JAMSTEC). These experiments
were performed at room humidity with 0.1 m s⁻¹ < V < 2.6 m s⁻¹,
δ > 2 m and 0.6 MPa < σ_n < 20 MPa (normal stress) on solid cylinders
(22 and 40 mm in diameter) or hollow cylinders (15/25, 15/39, 27/39
and 40/50 mm in internal/external diameter) in the case of cohesive
rock and on gouge layers confined by Teflon rings in the case of
non-cohesive rock. Figure 1 summarizes the friction coefficient as a
function of normalized displacement in experiments performed at seismic
slip rates for cohesive and non-cohesive rocks: both show a similar
exponential decay of friction from a peak (P) to a steady-state value
(SS). In all the experiments, friction decreases significantly with
increasing slip. Here we introduce the thermal slip distance, D_th,
defined as the slip distance over which the friction coefficient decays
to a value μ_th = μ_ss + (μ_p − μ_ss)/e (the experimental data are fitted by
an exponential decay from a peak value, μ_p, to a steady-state value, μ_ss).
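The definition of D_th can be illustrated with a minimal Python sketch of the exponential decay. The numerical values of μ_p, μ_ss and D_th below are illustrative assumptions, not data from any of the experiments, and the function name is hypothetical.

```python
import math

def friction(slip, mu_p, mu_ss, d_th):
    """Exponential decay of the friction coefficient with slip."""
    return mu_ss + (mu_p - mu_ss) * math.exp(-slip / d_th)

# Illustrative (assumed) values, not taken from a specific experiment.
MU_P, MU_SS, D_TH = 0.7, 0.2, 2.0  # peak, steady state, thermal slip distance (m)

mu_th = friction(D_TH, MU_P, MU_SS, D_TH)
# At slip = D_th the excess friction (mu - mu_ss) has dropped to 1/e of its peak value.
remaining_fraction = (mu_th - MU_SS) / (MU_P - MU_SS)
print(f"mu_th = {mu_th:.3f}, remaining fraction = {remaining_fraction:.3f}")
```

With these assumed values, the friction coefficient at a slip of D_th sits about 37% of the way from the steady-state value back up to the peak, whatever the rock type.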


[Figure 1 plot: friction coefficient (0.0-1.0) versus normalized slip, slip/D_th. Curves shown:
HVR1360, gypsum gouge (De Paola et al., unpubl.): σ_n = 0.80 MPa, V = 1.30 m s⁻¹ (flash heat., nanop. lubr., dehydr. & therm. press.)
HVR1138, anhydrite gouge (De Paola et al., unpubl.): σ_n = 0.82 MPa, V = 1.30 m s⁻¹ (flash heat. & nanop. lubr.)
HVR1161, dolomite gouge (ref. 10): σ_n = 0.81 MPa, V = 1.30 m s⁻¹ (flash heat., nanop. lubr., decarb. & therm. press.)
HVR687, gabbro (ref. 6): σ_n = 15.5 MPa, V = 1.14 m s⁻¹ (melt lubrication)
HVR178, clay-rich fault gouge (ref. 9): σ_n = 0.6 MPa, V = 1.03 m s⁻¹ (flash heat., nanop. lubr. & dehydr.)
HVR719, serpentinite (Hirose & Bystricky, 2007): σ_n = 2.6 MPa, V = 1.14 m s⁻¹ (flash heating & dehydr.)
HVR439, marble (ref. 7): σ_n = 12.1 MPa, V = 1.14 m s⁻¹ (nanop. lubr. & decarb.)]


Figure 1 | Friction coefficient versus normalized slip. Shear stress and slip
were normalized with respect to normal stress and the thermal slip distance,
D_th, respectively. The displacement was normalized because experiments
performed with different rocks and under different normal stresses had
different D_th values (Supplementary Information, section 4, and Fig. 2). The
friction coefficient decays exponentially with slip from a peak (P) at the
initiation of sliding to a steady-state (SS) value. The weakening mechanisms
that we assume to be dominant are shown in parentheses (flash heat., flash
heating; nanop. lubr., nanopowder lubrication; dehydr., dehydration reaction;
decarb., decarbonation reaction; therm. press., thermal pressure). For all the
weakening mechanisms, the friction coefficient in the steady state is <0.3.
Unpubl., unpublished experimental data. See Supplementary Information,
section 2, for the reference to Hirose and Bystricky, 2007.
[Figure 2 plot: thermal slip distance, D_th (axis ticks 0-16), versus normal stress (MPa), with power-law fits D_th = a σ_n^(−b). Data sets: tonalite (ref. 4); cataclasite (ref. 23); gabbro (ref. 6); monzodiorite (Mizoguchi & Hirose, unpubl.); peridotite (Del Gaudio et al., 2009); Carrara marble (refs 7 and 8); clay-rich gouge (ref. 9); Vaiont clay-rich gouge (Ferri et al., 2010; unpubl.); anhydrite dry gouge (De Paola et al., unpubl.); dolomite dry gouge (ref. 10); gypsum dry gouge (De Paola et al., unpubl.); dolomite (ref. 8).]


Figure 2 | The thermal slip distance, D_th, versus normal stress from
experiments performed with V = 1-1.6 m s⁻¹. This plot shows that different
rocks (and weakening mechanisms) have different D_th values and that D_th
decreases with increasing σ_n (power-law dependence with a general form
D_th = a σ_n^(−b), where a and b are experimentally determined coefficients). For
peridotite, a = 78 and b = 1.24; for anhydrite, a = 3 and b = 1.13. By
extrapolating the best-fit curves to seismogenic depths (σ_n > 200 MPa), we
predict that D_th ≈ 7 cm and ≈ 0.5 cm for peridotite and anhydrite, respectively.
It follows that low values of friction are easily achieved during earthquakes in
nature. See Supplementary Information, section 2, for references to Del Gaudio
et al., 2009, and Ferri et al., 2010.


The thermal slip distance varies significantly between experiments
performed on different rocks under the same experimental conditions
(mainly the normal stress and the slip rate; Supplementary Information,
section 4), but for a given rock type D_th seems to decrease according
to a power law for increasing normal stress (Fig. 2). For instance, in
the case of peridotite, D_th decreased from 13 m for σ_n = 5 MPa to 1.3 m
for σ_n = 20 MPa. The extrapolation of these data to seismogenic
depths (>7 km), where normal stresses acting on the fault can be
>200 MPa, suggests that D_th should decrease to less than 10 cm.
Recent theoretical work and field evidence also suggest that D_th
can decrease to a few centimetres in the presence of melts.
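The power-law extrapolation can be sketched in a few lines of Python. The sketch assumes that the coefficients a and b quoted in the Fig. 2 caption apply with σ_n in MPa and D_th in metres; the function name is hypothetical. Because the quoted coefficients are rounded, the output should be read as an order-of-magnitude check rather than a reproduction of the published fits.

```python
def thermal_slip_distance(sigma_n_mpa, a, b):
    """Power-law fit D_th = a * sigma_n**(-b).

    Assumes sigma_n in MPa and D_th in metres (units are an assumption here).
    """
    return a * sigma_n_mpa ** (-b)

# Rounded coefficients quoted in the Fig. 2 caption.
FITS = {"peridotite": (78.0, 1.24), "anhydrite": (3.0, 1.13)}

for rock, (a, b) in FITS.items():
    for sigma_n in (5.0, 20.0, 200.0):
        d_th = thermal_slip_distance(sigma_n, a, b)
        print(f"{rock:10s} sigma_n = {sigma_n:5.0f} MPa -> D_th ~ {d_th:.3g} m")
```

Under these assumptions the extrapolated D_th at 200 MPa falls in the centimetre to decimetre range for peridotite and below one centimetre for anhydrite, the same order of magnitude as the values quoted in the text.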

In Fig. 3, we summarize about 300 high-velocity (V > 0.1 m s⁻¹)
experiments performed on cohesive (for example novaculite, marble,
serpentinite, gabbro and tonalite) and non-cohesive rocks (anhydrite,
gypsum, dolomite and clay-rich gouges) typical of crustal seismogenic
sources (for data, see Supplementary Information, section 2). For each
rock type there is a drastic decrease in friction that we interpret as the
result of mechanically and thermally activated (that is, tribochemical),
often coexisting, weakening mechanisms (see Supplementary Information,
section 3, for a summary of the weakening mechanisms). For
instance, in calcite-built rocks such as marbles, a significant weakening
(to μ = 0.03) was concurrent with the emission of CO₂ from the
slipping zone and the production of lime and portlandite nanopowders.
The maximum measured average temperature was about 900 °C, which
is consistent with the thermally activated breakdown temperature for
CaCO₃ → CaO + CO₂. The measured weakening has been attributed
to thermal decomposition and the production of nanopowders. For melt
lubrication, we considered only experiments performed at σ_n > 5 MPa
because the nonlinear dependence of shear stress on normal stress in this
case results in an overestimate of the friction coefficient at low normal
stresses. The most striking features of Fig. 3 are the velocity dependence
of friction at high slip rates, the low steady-state friction at seismic slip
rates and the tendency of friction to cluster around a coefficient of
about 0.2-0.4 for V = 1 m s⁻¹, independently of the invoked weakening
mechanism.

Figure 3 does not provide information on the mechanical work rate
generated in the slipping zone, which we argue is a key parameter in
controlling the rate of temperature increase (dT/dt ∝ Φ(t): if a large
amount of work is generated locally within a short time then diffusive
heat loss is confined to a small rock volume, resulting in a large
temperature rise in the slipping zone) and the onset of mechanically and
thermally activated weakening processes. For simplicity, we introduce
an equivalent shear stress, τ_e = const., to determine the equivalent
power density, Φ_e ≈ τ_e V. This constant stress is a thermally balanced
average of the variable stress, τ(t), over a time interval t_th = D_th/V. It is
defined such that both τ_e and τ(t) yield the same temperature increase
on the fault over a time t_th. If we assume that heat diffuses from an
infinitesimally thin shear zone and use an exponential stress decay
(Fig. 1), we obtain (Methods)

$$\tau_e = \tau_{ss} + \frac{\sqrt{\pi}\,\mathrm{Erfi}(1)}{2e}\,(\tau_p - \tau_{ss})$$

Here Erfi is the imaginary error function (Erfi(1) = 1.65043).
The approximation of an infinitesimally thin layer is based on
observations of both natural and experimental seismic faults, which
show that the shear band active at high slip rates generally decreases in
Figure 3 | Steady-state friction coefficient versus slip rate. There is a marked
decrease in the friction coefficient when seismic slip rates (~1 m s⁻¹) are
approached. Note that slip rate is plotted on a logarithmic scale here. For melt
lubrication, we report only experiments performed with σ_n > 5 MPa (see text).
Data and extra references (Hirose and Bystricky, 2007, Ferri et al., 2010, Del
Gaudio et al., 2009, Weeks and Tullis, 1985, Shimamoto and Logan, 1981, and
Morrow et al., 2000) are in Supplementary Information, section 2.
thickness to a fraction of a millimetre, even for initially thicker gouge
layers. In Fig. 4, we plot the friction coefficient as a function of the
power density for cohesive (Fig. 4b) and non-cohesive (Fig. 4c) rocks
(for data, see Supplementary Information, section 5). Friction data still
show an evident drop approaching the power densities expected in
nature (Fig. 4a, grey area), but their scaling with power density
depends on the rock type. In particular, weakening of novaculite
occurs at lower power densities than does weakening of calcite- and
silicate-built rocks. Figure 4a suggests the activation of different
dynamic weakening mechanisms associated with different work rates
and temperature increases, ΔT, in the slipping zone. Assuming an
exponential decay, ΔT can be estimated (Supplementary Information,


Figure 4 | Steady-state friction coefficient versus power density. a, The
different weakening mechanisms are activated at increasing power
densities, τ_e V (here plotted on a logarithmic scale), which also reflect
the different temperatures (equation (1)) achieved in the slipping zone. The
box in red is the inset shown in b and c. The grey area is for power
densities expected in nature. b, c, Red-boxed region of a for cohesive rocks
(b) and non-cohesive rocks (c), showing curves of reaction speed constant, k
(right-hand axis, where 'conc' denotes concentration, for example
mol dm⁻³, and n is the order of the reaction), versus power density. Because
k varies with √(D_th/V) for a given power density, τ_e V (equation (2)), for
each rock type we plotted curves of reaction speed constant for the highest
and lowest values of √(D_th/V) (Methods). In b, for cohesive rocks,
novaculite weakening at low power densities is indicative of mechanically
activated chemical reactions (amorphization and gelification of quartz).
Marble weakening is triggered at higher power densities, where thermally
activated decarbonation reactions (green curves; we used E_a for
decarbonation of calcite nanoparticles; see Supplementary Information,
section 5) are more effective. The increase in the reaction rate constant
with power density (and temperature) implies the presence of more reaction
products in the slipping zone, concomitant with the boost of the weakening
mechanism. In c, for non-cohesive rocks, dehydration and decarbonation
reactions are expected to occur at higher power densities (and
temperatures) than those estimated in the slipping zone, suggesting that
weakening during the transient stage is dominated by mechanically
activated and local, thermally activated reactions and lubrication
mechanisms (nanopowder lubrication and flash weakening). Data are in
Supplementary Information, section 4, and extra references (Shimamoto and
Logan, 1981, Weeks and Tullis, 1985, Del Gaudio et al., 2009, and Ferri
et al., 2010) are in Supplementary Information.
section 4) once a significant friction drop has been triggered (that is,
over a slip distance D_th):

$$\Delta T(t_{th} = D_{th}/V) = \frac{\tau_e V}{\rho c_p}\sqrt{\frac{D_{th}}{\pi \kappa V}} \tag{1}$$


Here ρ is the rock density, c_p is the specific heat at constant pressure
and κ is the thermal diffusivity. Chemical reactions start once an
energy barrier (the activation energy, E_a) is overcome. The work rate,
τ_e V, and the temperature increase initiate tribochemical reactions
governing dynamic fault weakening. Tribochemical reactions include
mechanically and thermally activated, short-lived (10 3-10 ° s), local 
(<1 μm) processes operating during rubbing of the sliding surfaces,
which result in small quantities of reaction products.
Rubbing induces plastic deformation, fracturing and amorphization
(mechanically activated reactions), exposes clean, highly reactive
and catalytic surfaces (which lower the E_a value of the reaction) and
causes flash heating (a local and short-lived, thermally activated reaction)
of the asperity contacts (which increases the reaction rate). The E_a
values of mechanically activated reactions are lower than those of
thermally activated reactions, and when the reaction is simultaneous
with milling, the reaction rate is nearly independent of the bulk
temperature (Supplementary Information, section 5). Eventually, the
bulk temperature of the slipping zone increases, direct thermal activation
becomes more effective than mechanical excitation and larger quantities
of reaction products are produced. At this stage, the reaction rate is
proportional to the reaction speed constant, k (Arrhenius-type
dependence on temperature):


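$$k = A\exp\!\left(\frac{-E_a}{R\,(T_{amb} + \Delta T)}\right) = A\exp\!\left(\frac{-E_a}{R\left(T_{amb} + \dfrac{\tau_e V}{\rho c_p}\sqrt{\dfrac{D_{th}}{\pi \kappa V}}\right)}\right) \tag{2}$$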


Here A is the pre-exponential factor, R is the gas constant and T_amb is
room temperature.
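Equations (1) and (2) can be chained in a short Python sketch. Every numerical value below (rock properties, equivalent stress, activation energy, pre-exponential factor and the 293 K ambient temperature) is a hypothetical placeholder chosen only to exercise the formulas; none is taken from the experiments or the Supplementary Information.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1

def temperature_rise(tau_e, v, d_th, rho, c_p, kappa):
    """Equation (1): Delta T = tau_e*V/(rho*c_p) * sqrt(D_th/(pi*kappa*V))."""
    return tau_e * v / (rho * c_p) * math.sqrt(d_th / (math.pi * kappa * v))

def rate_constant(delta_t, e_a, pre_exp, t_amb=293.0):
    """Equation (2): Arrhenius rate constant at temperature T_amb + Delta T."""
    return pre_exp * math.exp(-e_a / (R_GAS * (t_amb + delta_t)))

# All values below are illustrative assumptions.
params = dict(rho=2700.0, c_p=900.0, kappa=1.0e-6)  # kg m^-3, J kg^-1 K^-1, m^2 s^-1
tau_e, v, d_th = 2.0e6, 1.0, 1.0                     # Pa, m s^-1, m (2 MW m^-2)
e_a, pre_exp = 2.0e5, 1.0e10                         # J mol^-1, s^-1 (hypothetical)

d_t = temperature_rise(tau_e, v, d_th, **params)
k = rate_constant(d_t, e_a, pre_exp)
print(f"Delta T ~ {d_t:.0f} K, k ~ {k:.3g} s^-1")
```

The sketch makes the structure of equation (2) explicit: for a fixed power density τ_e V, the temperature rise, and hence k, still depends on √(D_th/V), so a single power density does not map onto a single reaction rate.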

The plot in Fig. 4b, for cohesive rocks, distinguishes weakening
resulting from mechanically and local thermally activated reactions
(amorphization and gelification of quartz) from that resulting from
reactions dominated by thermal activation (decarbonation of calcite).
For a power density of 2 MW m⁻², which corresponds to an estimated
temperature increase of about 800-1,000 °C (see equation (1);
Supplementary Information, section 4), in the range of measured
temperatures, we observe the limit between decarbonation and melting.

For experiments performed on non-cohesive rocks, large weakening
occurs before the thermally activated, Arrhenius-type reaction
(equation (2)) should dominate, suggesting the operation of other
lubricating mechanisms (such as flash weakening at the asperity contacts
and nanopowder lubrication) controlled by mechanically activated
and local, thermally activated reactions.

In summary, high-velocity experiments indicate that at seismic slip
rates the friction coefficient decreases to about 10-30% of its initial
value (Figs 1 and 3). The extremely low friction data reported in Fig. 3
can occur in nature, implying large breakdown stress drops
(defined as the difference between the peak and the residual shear
stress or, in the experiments, the steady-state shear stress) at least in
some fault patches. In fact, the thermal slip distance, D_th,
decreases with increasing normal stress (Fig. 2): at typical crustal
stresses (>20 MPa), D_th < 1 m, which is below the typical displacement
for moderate to large earthquakes. Moreover, our experimental
estimate of the breakdown work, W_b, is consistent with the
seismological observations. The breakdown work, which is one of the most
robust earthquake source parameters retrieved from seismological
data, is the seismological equivalent of the fracture energy, or the
energy spent per unit fault area for the advancement of rupture.
Following refs 13, 14, we define W_b as the mechanical dissipation
associated with the breakdown stress drop. In the case of an exponential
decay, for δ ≫ D_th,

$$W_b = \int_0^{\delta} (\tau_p - \tau_{ss}) \exp\!\left(-\frac{\delta'}{D_{th}}\right) d\delta' \approx \sigma_n(\mu_p - \mu_{ss})D_{th} \tag{3}$$


the integral of the experimental shear stress curves. Equation (3) yields
W_b values ranging between 1 and 42 MJ m⁻² for most of the activated
weakening mechanisms (Supplementary Information, section 4).
These values are in the range of seismological W_b estimates for
moderate to large earthquakes (1 MJ m⁻² < W_b < 100 MJ m⁻²). But
earthquakes occur under higher normal stress than in these experiments;
because D_th = a σ_n^(−b) (with 1.13 < b < 1.24; Fig. 2), from
equation (3) we find that W_b ∝ (μ_p − μ_ss) a σ_n^(1−b). As a consequence, we
expect a decrease in the breakdown work of 30-35% for an increase
in σ_n from 20 to 200 MPa (that is, extrapolation of experimental data to
natural conditions). Because field and theoretical investigations
suggest that W_b is mostly converted to heat, the comparison between
natural and experimental W_b values indicates that the individual
processes governing dynamic fault weakening on experimental faults are
similar to those governing natural faults. This is confirmed by the
presence, in natural seismogenic faults, of fault products (solidified
melts, reaction products, fluidized gouges and so on) similar to
those produced in the experiments.
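The approximation in equation (3) can be checked numerically. The following Python sketch integrates the exponential stress decay with a midpoint rule and compares it with σ_n(μ_p − μ_ss)D_th. The normal stress, friction coefficients, D_th and total slip are assumed illustrative values, and the function names are hypothetical.

```python
import math

def breakdown_work_numeric(tau_p, tau_ss, d_th, slip, n=100_000):
    """Midpoint-rule integral of (tau_p - tau_ss)*exp(-x/D_th) from 0 to slip."""
    dx = slip / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += math.exp(-x / d_th)
    return (tau_p - tau_ss) * total * dx

def breakdown_work_approx(sigma_n, mu_p, mu_ss, d_th):
    """Right-hand side of equation (3), valid when slip >> D_th."""
    return sigma_n * (mu_p - mu_ss) * d_th

# Illustrative (assumed) values: 10 MPa normal stress, D_th = 1 m, 10 m of slip.
SIGMA_N, MU_P, MU_SS, D_TH, SLIP = 10.0e6, 0.7, 0.2, 1.0, 10.0

w_numeric = breakdown_work_numeric(MU_P * SIGMA_N, MU_SS * SIGMA_N, D_TH, SLIP)
w_approx = breakdown_work_approx(SIGMA_N, MU_P, MU_SS, D_TH)
print(f"numeric: {w_numeric / 1e6:.3f} MJ m^-2, approx: {w_approx / 1e6:.3f} MJ m^-2")
```

For slip ten times D_th the two values agree to better than 0.1%, which is why the closed form is adequate whenever the total displacement greatly exceeds the thermal slip distance.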

We conclude that the experimental work performed using high- 
velocity friction apparatus indicates that faults are lubricated when they 
are deformed at slip rates typical of earthquakes, independent of the rock 
and weakening mechanism involved. Because experiments have been 
performed at normal stresses (<25 MPa) lower than those expected in 
the Earth’s crust, some extrapolation is required to apply these results to 
seismogenic faults. A more direct verification may become available 
when new apparatus is developed that can perform experiments under 
the normal stresses expected in the crust (>50 MPa). 


METHODS SUMMARY 


The unpublished experiments cited in Figs 1-4 by De Paola et al., Di Toro and
Hirose, Mizoguchi and Hirose, and Ferri et al. were performed in the rotary shear
apparatus at Kyoto University (now at the Kochi Institute for Core Sample
Research, JAMSTEC, Japan). Experiments were conducted at room humidity with
0.1 m s⁻¹ < V < 1.3 m s⁻¹ and δ > 2 m on solid cylinders (25 mm in diameter) or
hollow cylinders (15/25 mm in internal/external diameter) in the case of cohesive
rock and on 1-mm-thick gouge layers sandwiched between two solid rock cylinders
and confined by Teflon rings in the case of non-cohesive rocks. In experiments
performed on non-cohesive rocks, the normal stress was in the range
0.6 MPa < σ_n < 2 MPa, given the low tensile strength of Teflon. For cohesive rocks,
the normal stress was in the range 5 MPa < σ_n < 20 MPa. The reliability of the
experimental procedure and the quality of the unpublished and published
experiments reported in this study are indicated by the observations that (1) the same rock
had similar frictional behaviour in different rotary shear apparatuses (and also in the
torsional Kolsky bar, not discussed here; Supplementary Information, section 1);
(2) different materials, such as metals, had different mechanical behaviour under
similar experimental conditions (Supplementary Information, section 1); and (3)
one data set (for novaculite) covers the whole slip-rate interval (from 1 μm s⁻¹ to
0.1 m s⁻¹) and the transition from high friction values (with results compatible with
those obtained in conventional biaxial and triaxial apparatuses in the overlapping
low-slip-rate range).


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS
Estimate of the temperature in the slipping zone during the transient. The
general solution for temperature, T(t), in a semi-infinite solid given a time-varying
heat flow function, Φ(t), imposed at the boundary of the solid (z = 0) is

$$T(t) = \frac{1}{2\rho c_p\sqrt{\pi\kappa}}\int_0^t \frac{\Phi(t-\xi)}{\sqrt{\xi}}\, e^{-z^2/4\kappa\xi}\, d\xi \tag{4}$$

where the factor of two in the denominator of the prefactor accounts for bilateral
heat diffusion in the wall rocks, ρ is the rock density, c_p is the specific heat capacity
and κ is the thermal diffusivity. The experiments were performed for a constant slip
rate (V(t) = V) and a variable shear stress, τ(t). As a consequence, the heat flow is

$$\Phi(t) = \tau(t)V \tag{5}$$

The shear stress evolution with slip, x, during the transient of length D_th is
approximated by an exponential decay in the form

$$\tau(x) = \tau_{ss} + (\tau_p - \tau_{ss})\,e^{-x/D_{th}} \tag{6}$$


where τ_ss and τ_p are the steady-state and peak shear stresses, respectively. Hence,
for x = D_th, the thermally activated slip weakening distance over which a
significant decrease in shear stress occurs, τ = τ_ss + (τ_p − τ_ss)/e (Fig. 1). We remark that
D_th corresponds to one-third of the slip weakening distance proposed in ref. 35,
where a slip weakening distance d_c is introduced as the displacement at which
τ_p − τ_ss decreases to 5% of its initial value. We consider D_th instead of d_c because at
D_th the shear stress weakening (that is, the lubrication of the slipping zone) is
already significant (τ_p − τ_ss has decreased to 36.8%, or 1/e, of its initial value),
indicating that the responsible chemical reactions or phase changes have been
already triggered. Moreover, the mathematical description using 1/e is simpler. By
replacing x with t in equation (6) and using x = Vt, we obtain

$$\tau(t) = \tau_{ss} + (\tau_p - \tau_{ss})\,e^{-Vt/D_{th}}$$

and equation (5) results in

$$\Phi(t) = \left[\tau_{ss} + (\tau_p - \tau_{ss})\,e^{-Vt/D_{th}}\right]V \tag{7}$$


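From equation (7), equation (4) is

$$T(t) = \frac{1}{2\rho c_p\sqrt{\pi\kappa}}\int_0^t \frac{\left[\tau_{ss} + (\tau_p - \tau_{ss})\,e^{-V(t-\xi)/D_{th}}\right]V\,e^{-z^2/4\kappa\xi}}{\sqrt{\xi}}\, d\xi \tag{8}$$

On solving the integral and considering the temperature at the boundary z = 0, we
obtain the temperature at time t:

$$T(t) = \frac{V\left[2\tau_{ss}\sqrt{t} + (\tau_p - \tau_{ss})\,e^{-Vt/D_{th}}\sqrt{\pi D_{th}/V}\;\mathrm{Erfi}\!\left(\sqrt{Vt/D_{th}}\right)\right]}{2\rho c_p\sqrt{\pi\kappa}} \tag{9}$$

To determine the temperature at z = 0 (border of the slipping zone) after a slip
D_th has occurred, we may use equation (9) to show that

$$T(t = D_{th}/V) = \frac{\sqrt{D_{th}V}\left[2e\,\tau_{ss} + (\tau_p - \tau_{ss})\sqrt{\pi}\,\mathrm{Erfi}(1)\right]}{2e\,\rho c_p\sqrt{\pi\kappa}} \tag{10}$$
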
where e = 2.7182 and Erfi(1) = 1.65043. The temperature estimated from equation
(10) has to be added to the initial (ambient) temperature to determine the actual
temperature in the slipping zone. Equation (10) should overestimate the
temperature increase after a slip distance D_th, as it does not include the heat losses by
radiation from the sample, interaction of the heated slipping zone with air (the air
cools the sample as the sample rotates at up to 1,500 r.p.m.), expulsion of hot material
from the slipping zone (for experiments performed in cohesive rocks) or heat
exchange due to the tribochemical reactions triggered during frictional sliding (for
example, the enthalpy of decarbonation of calcite is about 1.8 MJ kg⁻¹).
However, the samples used in the experiments are cylinders or rings of rock. It
follows that the determination of slip rate, V, in equation (10) is not straightforward,
as V increases with sample radius, r (V = ωr, where ω is the rotary speed). The slip
rate in the slipping zone is obtained in terms of 'equivalent slip rate', V_e (refs 5, 36):


$$V_e = \frac{4\pi R r}{3}$$

where R is the revolution rate of the motor. We refer to the equivalent slip rate simply
as slip rate in our study. As a consequence, equation (10) estimates the average
temperature in the slipping zone, as it is calculated from the equivalent slip rate,
V_e, and, for instance, the actual temperatures achieved at the sample edge might be
slightly higher than those predicted from equation (10). This might explain the small
underestimation of temperature predicted with equation (10) with respect to the
temperatures measured at particular points of the sample during the experiments.
Lastly, the estimate from equation (10) of the temperature increase in the slipping
zone implies the determination of thermal parameters (for example thermal
diffusivity) that vary with temperature and with rock texture. We conclude that equation
(10) yields a rough but useful, at least for the purposes of this study, estimate of the
temperature increase in the slipping zone.
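As a consistency check on the derivation above, the following Python sketch evaluates the equivalent slip rate for a solid cylinder and compares the closed form of equation (10) with a direct numerical integration of equation (8) at z = 0. The sample radius and rotation rate correspond to a 25-mm cylinder at 1,500 r.p.m.; the stresses, D_th and thermal properties are assumed illustrative values, and all function names are hypothetical.

```python
import math

ERFI_1 = 1.65043  # value of Erfi(1) quoted in the text

def equivalent_slip_rate(rev_per_s, radius):
    """V_e = 4*pi*R*r/3 for a solid cylinder of radius r (m) at R rev/s."""
    return 4.0 * math.pi * rev_per_s * radius / 3.0

def temperature_rise_eq10(tau_p, tau_ss, v, d_th, rho, c_p, kappa):
    """Equation (10): temperature rise at z = 0 after a slip D_th."""
    bracket = 2.0 * math.e * tau_ss + (tau_p - tau_ss) * math.sqrt(math.pi) * ERFI_1
    return math.sqrt(d_th * v) * bracket / (
        2.0 * math.e * rho * c_p * math.sqrt(math.pi * kappa))

def temperature_rise_quadrature(tau_p, tau_ss, v, d_th, rho, c_p, kappa, n=20_000):
    """Equation (8) at z = 0, integrated numerically up to t = D_th/V.

    The substitution xi = u**2 removes the 1/sqrt(xi) singularity.
    """
    t = d_th / v
    du = math.sqrt(t) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        tau = tau_ss + (tau_p - tau_ss) * math.exp(-v * (t - u * u) / d_th)
        total += 2.0 * tau * v * du
    return total / (2.0 * rho * c_p * math.sqrt(math.pi * kappa))

# 25-mm-diameter solid cylinder at 1,500 r.p.m. (25 rev/s).
v_e = equivalent_slip_rate(1500.0 / 60.0, 0.0125)

# Assumed illustrative stresses (Pa), D_th (m) and thermal properties.
args = dict(tau_p=0.7e6, tau_ss=0.2e6, v=v_e, d_th=5.0,
            rho=2700.0, c_p=900.0, kappa=1.0e-6)
t_closed = temperature_rise_eq10(**args)
t_numeric = temperature_rise_quadrature(**args)
print(f"V_e = {v_e:.3f} m/s, eq. (10): {t_closed:.1f} K, quadrature: {t_numeric:.1f} K")
```

The equivalent slip rate obtained for this geometry, about 1.31 m s⁻¹, is close to the 1.30 m s⁻¹ listed for several experiments in Fig. 1, and the closed form and the quadrature agree to within the precision of the quoted Erfi(1) value.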

Equivalent shear stress during the transient stage. The equivalent shear stress,
τ_e, is a constant shear stress that would yield the same temperature at the boundary
z = 0 at time t_th = D_th/V as the approximate exponential decay observed in the
experiments (Fig. 1). As a consequence, to determine τ_e we equated the
temperature achieved under constant τ_e to the temperature under the exponential decay
(equation (8)):

$$\frac{1}{2\rho c_p\sqrt{\pi\kappa}}\int_0^t \frac{\tau_e V}{\sqrt{\xi}}\, d\xi = \frac{1}{2\rho c_p\sqrt{\pi\kappa}}\int_0^t \frac{\left[\tau_{ss} + (\tau_p - \tau_{ss})\,e^{-V(t-\xi)/D_{th}}\right]V}{\sqrt{\xi}}\, d\xi$$

By simplifying and extracting τ_e, we find that

$$\tau_e = \frac{\int_0^t \left[\tau_{ss} + (\tau_p - \tau_{ss})\,e^{-V(t-\xi)/D_{th}}\right](1/\sqrt{\xi})\, d\xi}{\int_0^t (1/\sqrt{\xi})\, d\xi}$$

Upon integration we obtain

$$\tau_e = \tau_{ss} + \frac{\sqrt{\pi}\,\mathrm{Erfi}(1)}{2e}\,(\tau_p - \tau_{ss}) \tag{11}$$

with e = 2.7182 and Erfi(1) = 1.65043. We used equation (11) in the power
density plots reported in the main text (Fig. 4).
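The weighting factor in equation (11) can be verified with a short Python sketch that evaluates the ratio of the two integrals numerically for τ_ss = 0 and τ_p = 1, with time expressed in units of t_th = D_th/V. The function name is hypothetical; the value of Erfi(1) is the one quoted in the text.

```python
import math

def equivalent_stress_coefficient(n=20_000):
    """Ratio of the two integrals defining tau_e, for tau_ss = 0 and tau_p = 1.

    Time is measured in units of t_th = D_th/V, so the upper limit is 1.
    The substitution xi = u**2 removes the 1/sqrt(xi) singularity.
    """
    du = 1.0 / n
    numerator = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        numerator += 2.0 * math.exp(-(1.0 - u * u)) * du
    denominator = 2.0  # integral of xi**-0.5 from 0 to 1
    return numerator / denominator

closed_form = math.sqrt(math.pi) * 1.65043 / (2.0 * math.e)
coefficient = equivalent_stress_coefficient()
print(f"quadrature: {coefficient:.4f}, closed form: {closed_form:.4f}")
# prints "quadrature: 0.5381, closed form: 0.5381"
```

In other words, the equivalent shear stress lies slightly above the midpoint between the steady-state and peak shear stresses: τ_e ≈ τ_ss + 0.54(τ_p − τ_ss).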
Plot of the reaction-speed-constant curves in the friction versus power density plots.
In Fig. 4b, c, we plotted the reaction speed constant curves versus power density,
τ_e V.
By combining equations (10) and (11), we find that the temperature increase in
the slipping zone after a slip distance D_th is

$$\Delta T(t = D_{th}/V) = \frac{\tau_e V}{\rho c_p\sqrt{\pi\kappa}}\sqrt{\frac{D_{th}}{V}}$$

As a consequence, the reaction speed constant is

$$k = A\exp\!\left(\frac{-E_a}{R\left(T_{amb} + \dfrac{\tau_e V}{\rho c_p}\sqrt{\dfrac{D_{th}}{\pi\kappa V}}\right)}\right) \tag{12}$$

where T_amb is the room temperature. Equation (12) allows us to determine the
reaction speed constant for a given power density once D_th is known. Figure 4
includes experiments performed over a broad range of normal stresses and slip
rates. As reported in Fig. 2, for a given rock D_th decreases with increasing normal
stress and, for a given power density, τ_e V, the reaction speed constant varies with
√(D_th/V) (equation (12)). We therefore considered the smallest and largest
√(D_th/V) values for a given rock type, resulting in the two reaction-speed-constant
curves (for the same reaction) reported in Fig. 4b, c. The thermal properties (ρ, c_p
and κ) of the different rocks (Supplementary Information, section 5) at a
temperature of 300 K were considered in equation (12) and in Fig. 4b, c.
Duplications of the neuropeptide receptor gene
VIPR2 confer significant risk for schizophrenia
Rare copy number variants (CNVs) have a prominent role in the
aetiology of schizophrenia and other neuropsychiatric disorders.
Substantial risk for schizophrenia is conferred by large (>500-kilobase)
CNVs at several loci, including microdeletions at 1q21.1 (ref. 2),
3q29 (ref. 3), 15q13.3 (ref. 2) and 22q11.2 (ref. 4) and microduplication
at 16p11.2 (ref. 5). However, these CNVs collectively account for
a small fraction (2-4%) of cases, and the relevant genes and
neurobiological mechanisms are not well understood. Here we performed a
large two-stage genome-wide scan of rare CNVs and report the
significant association of copy number gains at chromosome 7q36.3
with schizophrenia. Microduplications with variable breakpoints
occurred within a 362-kilobase region and were detected in 29 of
8,290 (0.35%) patients versus 2 of 7,431 (0.03%) controls in the
combined sample. All duplications overlapped or were located within 89
kilobases upstream of the vasoactive intestinal peptide receptor gene
VIPR2. VIPR2 transcription and cyclic-AMP signalling were
significantly increased in cultured lymphocytes from patients with
microduplications of 7q36.3. These findings implicate altered vasoactive
intestinal peptide signalling in the pathogenesis of schizophrenia and
indicate the VPAC2 receptor as a potential target for the development
of new antipsychotic drugs.
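The carrier frequencies quoted above follow directly from the counts, as the short Python sketch below shows. The crude (unadjusted) odds ratio it also computes is an illustrative assumption about how one might summarize the combined counts; it is not the exact conditional statistic reported in Table 1.

```python
# Counts of 7q36.3 microduplication carriers in the combined sample.
cases_dup, cases_total = 29, 8290
controls_dup, controls_total = 2, 7431

case_freq = cases_dup / cases_total
control_freq = controls_dup / controls_total

# Crude 2x2 odds ratio (carriers vs non-carriers), with no stratification.
crude_or = (cases_dup * (controls_total - controls_dup)) / (
    controls_dup * (cases_total - cases_dup))

print(f"cases: {100 * case_freq:.2f}%, controls: {100 * control_freq:.2f}%")
print(f"crude odds ratio ~ {crude_or:.1f}")
```

The frequencies round to 0.35% and 0.03%, matching the text, and the crude odds ratio is roughly 13.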

A majority of the rare CNVs that have been implicated in schizo- 
phrenia involve large (>500 kb) genomic regions where local segmental 
duplication architecture promotes frequent and nearly identical re- 
arrangements by non-allelic homologous recombination (NAHR). 
Because of the high structural mutation rates at these loci, the strong 
phenotypic effects of the causal variants, and the excellent power of 
most array platforms to detect such large CNVs, these genomic hot- 
spots were the first to be detected in studies of CNVs in schizophrenia. 
As most of the genome lacks the duplication architecture of the 
NAHR hotspots described earlier and because a variety of muta- 
tional mechanisms can give rise to structural rearrangements, causal 
variants in other regions of the genome may consist of CNVs that 
are individually rarer and smaller (<500 kb) than those arising at 
NAHR hotspots. For example, microdeletions of the gene neurexin
1 (NRXN1), which are highly enriched in autism and schizophrenia,


Table 1 | Significant association of four CNV regions with schizophrenia

Region (hg18) | Genes | Band | Type | Primary cases | Primary controls | Secondary cases | Secondary controls | Peak OR (CI) | Peak P-value | Permutation P-value
chr22: 19786712-19795854 | BCR | 22q11.2 | Del. | 2 | 0 | 22 | 0 | 14.21* (4.24, infinity) | 2.4 × 10^-5 | <5.00 × 10^-?
chr7: 158731401-158810016 | VIPR2** | 7q36.3 | Dup. | 2 | 0 | 8 | 1 | 16.41 (3.11, infinity) | 8.39 × 10^-5 | 4.00 × 10^-?
chr16: 29569647-30209382 | 28 genes | 16p11.2 | Dup. | 4 | 0 | 8 | 1 | 16.14 (3.06, infinity) | 0.000097 | 0.0001
chr15: 29694064-29705665 | OTUD7A | 15q13.3 | Del. | 2 | 0 | 6 | 1 | 14.94 (2.80, infinity) | 0.00023 | 0.00016
chr7: 158448321-158605936 | VIPR2, BC042556 | 7q36.3 | Dup. | 2 | 0 | 2 | 0 | 8.26* (2.36, infinity) | 0.00086 | 0.0007
chr15: 28881608-28991107 | MTMR15 | 15q13.3 | Del. | 2 | 0 | 6 | 1 | 14.94 (2.80, infinity) | 0.00023 | 0.001
chr3: 196826549-196872080 | CR597873, SDHALP2 | 3q29 | Dup. | 2 | 0 | 8 | 0 | 5.65* (1.56, infinity) | 0.01 | 0.005
chr6: 162835583-162997592 | PARK2 | 6q26 | Dup. | 2 | 0 | 6 | 0 | 4.41* (1.17, infinity) | 0.03 | 0.044

Events, ORs and exact conditional (EC) P-values listed here correspond to the peak of association. Empirical P-values for the entire target region were then computed based on permutation of case and control
labels. The minimal threshold for statistical significance after Bonferroni correction for the 114 loci tested was permutation P < 4.4 × 10^-4. When the number of controls in the secondary sample was 0, Haldane
correction (adding 0.5 to each cell in the table) was applied in order to get a finite OR (*). All genes overlapping with the target region are listed or the closest gene within 100 kb (**). Del., deletion; Dup., duplication.
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consist of overlapping deletions with non-recurrent breakpoints. 
NRXN1 deletions are not flanked by segmental duplications, and may 
occur by different mutational mechanisms such as non-homologous 
end joining (NHEJ) or DNA-replication-mediated rearrangement. 

Figure 1 | Detection and validation of microduplications and triplications
of 7q36.3. a, Map of CNVs detected in the primary and secondary cohorts
from the UCSC genome browser (http://genome.ucsc.edu). ISC, International
Schizophrenia Consortium. b, Plots of probe intensity ratios for 16 CNVs
detected in the primary and MGS data sets. All are cases with the exception of
two controls who are indicated with an asterisk. Regions with estimated copy
numbers of 2, 3 and 4 are highlighted in grey, blue and green, respectively.
Locations of four Sequenom validation assays are shown (dashed lines).
c-f, CNV genotypes were confirmed by MeZOD cluster plots of probe intensity
ratios of the proximal and distal regions in the primary data set (c and
d, respectively) and secondary data set (e and f, respectively). g, h, Absolute
copy numbers were confirmed for duplications and triplications of the
proximal (VIPR2-L and VIPR2-R) and distal (7qter-L and 7qter-R) regions
(b, dashed lines) by Sequenom MASSarray genotyping. Error bars represent the
standard deviation of three replicates.

To identify novel schizophrenia genes, we investigated copy number
variation genome-wide using an approach that detects enrichment of
multiple overlapping rare variants. Regions of interest were defined in a
primary sample of 802 patients and 742 controls as genomic segments
containing CNVs in at least two cases and in no controls. This discovery
step yielded 114 genomic regions of interest. In the secondary cohort of
7,488 patients and 6,689 controls, we assessed the association of these
regions with schizophrenia (Supplementary Table 2). All CNVs
overlapping each of the 114 regions of interest were collected, and CNV
breakpoints falling within each region were used to partition the region
into a series of non-overlapping segments or bins (see Supplementary
Fig. 1). Significance was tested within each bin by the exact conditional
test, with ethnicity and study as covariates. The segment with the
minimal P-value was defined as the peak of association within the
region, and a permutation-based multiple testing correction scheme
was applied to obtain the P-value for the region.
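
The partitioning step described above can be illustrated with a minimal
sketch. It assumes half-open intervals in base pairs and uses toy
coordinates that are hypothetical, not taken from the study; the function
names are ours, and the exact conditional test and permutation scheme are
not reproduced here.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, in base pairs


def partition_region(region: Interval, cnvs: List[Interval]) -> List[Interval]:
    """Split `region` into non-overlapping bins at every CNV breakpoint inside it."""
    start, end = region
    cuts = {start, end}
    for cnv_start, cnv_end in cnvs:
        for pos in (cnv_start, cnv_end):
            if start < pos < end:  # only breakpoints strictly inside the region
                cuts.add(pos)
    ordered = sorted(cuts)
    return list(zip(ordered[:-1], ordered[1:]))


def count_overlaps(bin_: Interval, cnvs: List[Interval]) -> int:
    """Number of CNVs that overlap a bin (half-open interval logic)."""
    bin_start, bin_end = bin_
    return sum(1 for s, e in cnvs if s < bin_end and e > bin_start)


# Hypothetical toy coordinates, for illustration only.
region = (100, 200)
case_cnvs = [(90, 150), (120, 180), (140, 260)]
control_cnvs = [(170, 210)]

bins = partition_region(region, case_cnvs + control_cnvs)
case_counts = [count_overlaps(b, case_cnvs) for b in bins]
control_counts = [count_overlaps(b, control_cnvs) for b in bins]
```

Each bin then carries its own case and control counts, so a per-bin test can
be applied and the bin with the smallest P-value taken as the peak.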

Of the 114 regions detected in the first step, four had statistically
significant associations in the secondary sample after Bonferroni
correction (α = 0.05/114 = 4.4 × 10⁻⁴). Table 1 lists the four regions with
significant P-values meeting this criterion and an additional four loci
with nominally significant P-values (P < 0.05) in the secondary
cohort. Regions with significant associations were loss of copy number
at 22q11.2 (P < 5 × 10⁻⁵, odds ratio (OR) = 14.2), gain at 7q36.3
(P = 4.0 × 10⁻⁵, OR = 16.4), gain at 16p11.2 (P = 1.0 × 10⁻⁴,
OR = 16.1) and loss at 15q13.3 (P = 1.6 × 10⁻⁴, OR = 14.9). No
significant heterogeneity was observed for these genomic regions across
studies (Breslow-Day-Tarone P = 0.42-0.83).
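
The arithmetic behind the significance threshold, and the Haldane
correction mentioned in the footnote to Table 1, can be sketched as follows.
This is an illustration under stated assumptions: the odds ratio here is the
crude cross-product ratio, not the exact conditional estimate with
covariates used in the study, so it will not reproduce the ORs in Table 1.
The function names and the example counts are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def odds_ratio(case_carriers: int, case_total: int,
               control_carriers: int, control_total: int) -> float:
    """Crude odds ratio; adds 0.5 to every cell (Haldane) if any cell is zero."""
    cells = [
        case_carriers,
        case_total - case_carriers,
        control_carriers,
        control_total - control_carriers,
    ]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


threshold = bonferroni_threshold(0.05, 114)

# Permutation P-values quoted in the text and Table 1
# (the 22q11.2 entry is an upper bound, P < 5e-5).
permutation_p = {
    "22q11.2 del": 5e-5,
    "7q36.3 dup": 4.0e-5,
    "16p11.2 dup": 1.0e-4,
    "15q13.3 del": 1.6e-4,
    "7q36.3 dup (proximal)": 7e-4,
}
significant = [name for name, p in permutation_p.items() if p < threshold]

# Hypothetical counts, chosen only to exercise both branches.
or_zero_controls = odds_ratio(10, 1000, 0, 1000)  # Haldane correction applied
or_no_zero = odds_ratio(8, 1000, 1, 1000)         # no correction needed
```

With these inputs the fifth-ranked region (P = 0.0007) falls above the
threshold, consistent with only four regions being reported as significant.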

15q13.3, 16p11.2 and 22q11.2 are well-documented loci conferring
increased risk for schizophrenia. All are hotspots for NAHR, and all
alleles contributing to the association consist of large deletions with
similar breakpoints. By contrast, microduplications at 7q36.3 have not
been previously implicated in neuropsychiatric disorders. The 7q36.3
region harboured CNVs that overlapped but differed in size and
breakpoint positions (Fig. 1a). The peak of association is located in the
subtelomeric region of 7q, upstream of the gene VIPR2. Also, ranking
fifth among the associations genome-wide was another region, 125 kb
proximal to the peak at 7q36.3 (P = 0.0007; Table 1). When the two
7q36.3 regions were combined into a single 362-kb region (chromosome 7
(chr7): 158448321-158810016), duplications were detected in 29 of 8,290
(0.35%) patients and 2 of 7,431 (0.03%) controls in this study. The
P-value for the combined region in the combined sample was
5.7 × 10⁻⁷ and the odds ratio (OR) (95% confidence interval (CI))
was 14.1 (3.5, 123.9). A complete list of 7q36.3 CNVs is provided in
Supplementary Table 3.
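
As a rough check on the combined-sample counts, the carrier frequencies and
an unadjusted association test can be computed directly. The sketch below
uses a plain two-sided Fisher exact test, which is an assumption made for
illustration: the study used an exact conditional test with ethnicity and
study as covariates, so the reported P-value and OR are not expected to
match exactly. The crude odds ratio from these counts is about 13, close to
but not equal to the reported 14.1.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (as improbable) as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Counts quoted in the text for the combined 362-kb region.
case_carriers, case_total = 29, 8290
control_carriers, control_total = 2, 7431

case_freq = 100 * case_carriers / case_total           # percent
control_freq = 100 * control_carriers / control_total  # percent

a, b = case_carriers, case_total - case_carriers
c, d = control_carriers, control_total - control_carriers
crude_or = (a * d) / (b * c)
p_value = fisher_exact_two_sided(a, b, c, d)
```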

We examined sensitivity and specificity of CNV calls in the 7q36
region to determine the possibility of a spurious association
(Supplementary Note). No additional duplications >100 kb were detected after
reducing the stringency of our CNV filtering criteria. Next, identical
CNV calls were obtained using a more sensitive targeted CNV calling
algorithm, median Z-score outlier detection (MeZOD) (Fig. 1c-f). All but
one of the duplications (control sample 06C52730) were confirmed using
the Sequenom MASSarray genotyping platform with assays designed for
the proximal region (Fig. 1g) and for the distal region (Fig. 1h). Validated
CNVs discovered in the Molecular Genetics of Schizophrenia (MGS)
study subjects were mapped at higher resolution using the NimbleGen
HD2 platform, and plots of probe intensity ratios from the HD2 array are
shown in Fig. 1b and Supplementary Fig. 3. In addition, tandem
duplications of the VIPR2 gene were confirmed in two patients by
fluorescence in situ hybridization (FISH) (Supplementary Fig. 4).
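
The details of MeZOD are not given in this section, so the following is only
a generic sketch of median-based Z-score outlier detection, not the authors'
algorithm. It assumes one summary value per sample (for example, the mean
log2 intensity ratio across probes in the target region), scales deviations
from the median by the median absolute deviation, and flags samples beyond a
hypothetical cut-off. The input values are invented; 0.58 is roughly
log2(3/2), the ratio expected for a single-copy gain.

```python
from statistics import median
from typing import List


def median_z_scores(values: List[float]) -> List[float]:
    """Robust Z-scores: deviation from the median, scaled by 1.4826 * MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust Z-scores are undefined")
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [(v - med) / scale for v in values]


def flag_outliers(values: List[float], threshold: float = 3.0) -> List[int]:
    """Indices of samples whose robust Z-score exceeds the threshold in size."""
    return [i for i, z in enumerate(median_z_scores(values)) if abs(z) > threshold]


# Hypothetical per-sample mean log2 ratios for one target region.
log2_ratios = [0.02, -0.01, 0.00, 0.03, -0.02, 0.01, 0.58, -0.03, 0.02, 0.00]
outliers = flag_outliers(log2_ratios)
```

Using the median rather than the mean keeps the reference level stable when
a few samples carry a CNV, which is the usual motivation for this kind of
targeted calling.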

Unexpectedly, manual examination of probe ratios in Fig. 1b 
revealed additional structural complexity within some of the 7q36.3 
CNVs. Copy number profiles in four patients (03C23250, 05C43079, 
03C23091 and 00C02204) indicated that there were triplications 
nested within duplications of the proximal region (Fig. 1b). In all four 
patients, a triplication overlapped with exons 3 and 4 of the gene 
VIPR2. A copy number of four was confirmed in these samples using 
the Sequenom MASSarray CNV assay (Fig. 1g), and results for all
samples were consistent with results in Fig. 1b. VIPR2 transcripts were 
amplified from messenger RNA samples from the four triplication 
carriers. The normal VIPR2 transcript was detected in all samples, 
and we did not observe a larger product corresponding to a transcript 
with duplicated exons. 

Inheritance of the duplication at 7q36.3 could be evaluated in three 
families (Fig. 2). In family 02-135, the duplication was confirmed in the 
proband, but not detected in either of the unaffected parents, and thus is 
apparently de novo (Fig. 2f). In family 02-016, the duplication was 
detected in the proband and in a mother with a diagnosis of depression 
(Fig. 2d). In family LW102, the duplication was detected in the proband 
and in an unaffected mother. The proband’s mother also had a son with 
a diagnosis of schizophrenia (LW-102-03) from a second marriage, but 
DNA was not available on this individual. Clinical psychiatric reports of 
patients 02-016 and 02-135 are provided in the Supplementary Note. 

Figure 2 | Patterns of CNV inheritance in families. a-f, Pedigree diagrams
are shown for families LW102 (a), 02-016 (b) and 02-135 (c), along with the
Sequenom validation for families LW102 (d), 02-016 (e) and 02-135
(f). Sequenom validation was performed on a mother and one of the affected
sons (a), and all three family members (b), along with 10 CEU HapMap
controls. Sequenom assays confirmed that duplications (Dup.) were present in
the patients and maternally inherited from LW102-2 and 02-0016-4. AO, age at
onset. Error bars represent the standard deviation of three replicates.
Probes shown: 7qter-L and VIPR2-L.

Variable expressivity is often characteristic of pathogenic CNVs.
We evaluated the spectrum of psychiatric phenotypes associated with
7q36.3 duplications by screening for these events in individuals with
bipolar disorder or autism spectrum disorder (ASD). Microarray data
were available for 2,777 patients from the Bipolar Genome Study
(BiGS), for 996 ASD patients from the Autism Genome Project
Consortium (AGP), and from our unpublished analyses of 114 patients
with ASD using the NimbleGen HD2 platform. Microduplications of
7q36.3 (>100 kb in size) were detected in 3 of 1,110 (0.27%) patients
with ASD; compared with the controls described earlier, P = 0.018.
Microduplications at 7q36.3 were detected in 2 of 2,777 (0.07%)
patients with bipolar disorder; compared with the controls, P = 0.23.
These results provide preliminary evidence that the clinical phenotypes
associated with 7q36.3 duplications may include paediatric
neurodevelopmental disorders such as autism, but do not include bipolar
disorder. Also worthy of note, larger chromosomal abnormalities
involving 7q have been reported in association with neurodevelopmental
disorders, including deletions of 7q36-7qter and duplications of
7q35-7qter; a 550 kb duplication of 7qter (of unknown clinical
relevance) has been reported in a patient with neurofibromatosis.

These genetic data implicate the gene VIPR2. All variants
contributing to the observed association at 7q36.3 overlap with this gene
or lie within the geneless subtelomeric region <89 kb from the
transcriptional start site of VIPR2. VIPR2 encodes the vasoactive intestinal
peptide (VIP) receptor VPAC2, a G-protein-coupled receptor that is
expressed in a variety of tissues including, in the brain, the
suprachiasmatic nucleus, hippocampus, amygdala and hypothalamus. VPAC2
binds VIP, activates cyclic AMP (cAMP)-signalling and PKA,
regulates synaptic transmission in the hippocampus, and promotes the
proliferation of neural progenitor cells in the dentate gyrus. Genetic
studies in mouse have established that VIP signalling has a role in
learning and memory. VPAC2 also has a role in sustaining normal
circadian oscillations in the suprachiasmatic nucleus, and VIPR2-null
and VIPR2-overexpression mice exhibit abnormal rhythms
of rest and activity.

cAMP signalling has been implicated in schizophrenia. We proposed
that increases in VIPR2 transcription and VPAC2-mediated
cAMP signalling would be a consequence of the microduplications at 
7q36.3. We thus assessed VIPR2 mRNA and cAMP accumulation in 
response to VIP and a VPAC2-selective agonist (BAY 55-9837) in
lymphoblastoid cell lines from 13 MGS study subjects: two with
subtelomeric duplications, three with duplications of VIPR2, four with partial
triplications, and four controls with normal copy number of the region 
(see Supplementary Note). VIPR2 transcripts were present at low but 
measurable levels in all cell lines. VIPR2 mRNA levels were significantly 
increased in duplication carriers compared with controls (Fig. 3a). 
Likewise, cAMP responses to VIP and BAY 55-9837 were significantly 
greater in lymphoblastoid lines from carriers as compared to controls 
(Fig. 3b). In contrast, we observed no group difference in cAMP accu- 
mulation in response to a different G-protein-coupled receptor agonist, 
prostaglandin E2, thus confirming that the effect of 7q36 duplications on 
cAMP accumulation is mediated by the VPAC2 receptor. 
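
The comparison of carriers with controls rests on two simple computations:
expressing each measurement as a fold-change relative to the mean of the
controls, and testing the group difference with an unpaired t-test. A
minimal sketch follows. The expression values are hypothetical, in arbitrary
units, and the pooled-variance (Student) form of the test is an assumption,
since the text does not state which variant was used. Only the t statistic
and degrees of freedom are computed; converting them to a P-value requires a
t-distribution function that is outside the standard library.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Tuple


def fold_change(samples: List[float], controls: List[float]) -> List[float]:
    """Express each sample relative to the mean of the control samples."""
    reference = mean(controls)
    return [value / reference for value in samples]


def pooled_t_statistic(group_a: List[float], group_b: List[float]) -> Tuple[float, int]:
    """Unpaired Student t statistic (equal-variance form) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical expression values, not data from the study.
controls = [0.9, 1.1, 1.0, 1.0]
carriers = [1.8, 2.2, 2.0, 2.4, 1.6]

carrier_fc = fold_change(carriers, controls)
control_fc = fold_change(controls, controls)
t_stat, dof = pooled_t_statistic(carrier_fc, control_fc)
```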

The expression patterns that we observe indicate that a variety of 
different genomic duplications can influence the transcription of 
VIPR2. The exact genetic mechanism for this is unclear. Given that 
some risk variants are upstream of the gene and others are complex 
rearrangements that could potentially disrupt the duplicate copy, our 
results cannot be explained simply by an increase in gene dosage. It is 
likely that duplications of 7q36 affect the regulation of VIPR2. Tandem 
duplication of regulatory sequences, for instance, could affect expres- 
sion of the gene. Alternatively, the subtelomeric location of VIPR2 could 
be relevant to the mechanism. Intrinsic regulation of telomere structure 
and function often affects the transcriptional regulation of adjacent 
genes, a phenomenon known as telomere position effect. If VIPR2
is under such epigenetic regulation, any large tandem duplication of the 
subtelomeric region could potentially cause the gene to escape repres- 
sion. However, further studies are needed to determine the mechanism 
by which structural variants influence VIPR2 expression.

Figure 3 | Duplications and triplications of 7q36.3 result in increased
VIPR2 transcription and cAMP signalling. a, Quantitative PCR results of 
VIPR2 mRNA from lymphoblastoid cell lines. Two to four subjects were tested 
for each of four genotypes (subtelomeric duplication, VIPR2 duplication, exon 
3/4 triplication and normal diploid copy number as control). Results are 
expressed as the mean fold-change of the sample relative to the mean of control 
samples. b, c, cAMP accumulation was measured in the same cell lines in response
to VIP (100 nM) and the VPAC2 agonist BAY 55-9837 (100 nM). Results are 
expressed as fold-change over forskolin (Fsk)/IBMX (3-isobutyl-1- 
methylxanthine, a phosphodiesterase inhibitor) alone. d, No significant 
differences were observed in cAMP response to another G-protein-coupled 
receptor agonist, prostaglandin E2 (PGE2, 1 µM), demonstrating that the effects
are specific to VPAC2. For subjects, error bars represent standard error of the 
mean computed across replicates. Differences between the groups of nine 
duplication carriers and four controls were tested using unpaired two-tailed t-test. 


In light of the emerging roles of VIPR2 in the brain, our results
support the hypothesis that the pathogenesis of schizophrenia, in
some patients, involves the dysregulation of cellular processes such
as adult neurogenesis and synaptic transmission and of the corresponding
cognitive processes of learning and memory. Furthermore, in light
of the brain expression patterns of VIPR2 (ref. 13), our results support
the involvement of certain brain regions, such as the hippocampus,
amygdala and the suprachiasmatic nucleus.

The link between VIPR2 duplications and schizophrenia may have 
significant implications for the development of molecular diagnostics 
and treatments for this disorder. Genetic testing for duplications of the 
7q36 region could enable the early detection of a subtype of patients 
characterized by overexpression of VIPR2. Significant potential also 
exists for the development of therapeutics targeting this receptor. For 
instance, a selective antagonist of the VPAC2 receptor could have 
therapeutic potential in patients who carry duplications of the VIPR2 
region. Peptide derivatives and small molecules have been identified
that are selective VPAC2 inhibitors, and these pharmacological studies
offer potential leads in the development of new drugs. Although
duplications of VIPR2 account for a small percentage of patients, the rapidly
growing list of rare CNVs that are implicated in schizophrenia indicates
that this psychiatric disorder is, in part, a constellation of multiple rare
diseases. This knowledge, along with a growing interest in the
development of drugs targeting rare disorders, provides an avenue for the
development of new treatments for schizophrenia. 


METHODS SUMMARY 

Cohort description. Our primary cohort consisted of unrelated patients derived 
from family-based studies conducted by investigators at the University of 
Washington, McLean Hospital, Columbia University, Trinity College Dublin, 
New York University and Harvard Medical Schools (Supplementary Table 1). 
All samples were analysed by array comparative genomic hybridization (CGH) 
using the NimbleGen HD2 platform. The secondary cohort consisted of 
Affymetrix SNP Array 6.0 data from the MGS study of schizophrenia, publicly
available data from the International Schizophrenia Consortium (ISC), genotyped
using Affymetrix 6.0 and 5.0 platforms, and Affymetrix 6.0 data on an independent
set of controls from the BiGS (Supplementary Table 1).

Intensity data processing and rare CNV calling. With the exception of the 
published CNV calls from the ISC, all data were processed and analysed centrally 
as follows. Microarray intensity data were normalized, and CNV calls were generated 
using an analysis package that we developed called C-score. All CNV call sets were 
filtered in a consistent fashion. To minimize the differential sensitivity of the various 
array platforms to detect CNVs, we limited our analysis to CNVs > 100 kb. This size 
range is readily detectable by all platforms used in this study. The same criteria have 
been previously applied to filter CNVs across studies. Last, sensitivity to detect large
(>100 kb) copy number polymorphisms (CNPs) was evaluated at several locations 
in the genome. Overall sensitivity to detect CNVs was comparable in cases and 
controls in both cohorts. Additional details regarding C-score, statistical methods 
and evaluation of CNV call sets are described in the Supplementary Note. 
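
The size filter described above can be sketched in a few lines. This is not
the C-score package, whose interface is not described here; the record type,
field names and sample identifiers are hypothetical. The first example call
reuses the coordinates of the combined 7q36.3 region quoted in the text
purely as a convenient interval of about 362 kb.

```python
from dataclasses import dataclass
from typing import List

MIN_LENGTH_BP = 100_000  # analysis restricted to CNVs larger than 100 kb


@dataclass(frozen=True)
class CnvCall:
    sample: str
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"

    @property
    def length(self) -> int:
        return self.end - self.start


def filter_large(calls: List[CnvCall], min_length: int = MIN_LENGTH_BP) -> List[CnvCall]:
    """Keep only calls longer than `min_length`, applied identically to every call set."""
    return [call for call in calls if call.length > min_length]


# Hypothetical calls for illustration.
calls = [
    CnvCall("sample_A", "chr7", 158448321, 158810016, "dup"),  # ~362 kb, kept
    CnvCall("sample_B", "chr7", 158700000, 158750000, "dup"),  # 50 kb, dropped
]
kept = filter_large(calls)
```

Applying one threshold to every platform is what makes the call sets
comparable, at the cost of discarding smaller events that only the
higher-resolution arrays could detect.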
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Spermatogenesis is one of the most complex and longest processes 
of sequential cell proliferation and differentiation in the body, 
taking more than a month from spermatogonial stem cells, through
meiosis, to sperm formation. The whole process, therefore, has
never been reproduced in vitro in mammals, nor in any other
species with a very few exceptions in some particular types of
fish. Here we show that neonatal mouse testes which contain only
gonocytes or primitive spermatogonia as germ cells can produce 
spermatids and sperm in vitro with serum-free culture media. 
Spermatogenesis was maintained over 2 months in tissue fragments 
positioned at the gas-liquid interphase. The obtained spermatids 
and sperm resulted in healthy and reproductively competent off- 
spring through microinsemination. In addition, neonatal testis 
tissues were cryopreserved and, after thawing, showed complete 
spermatogenesis in vitro. Our organ culture method could be 
applicable through further refinements to a variety of mammalian 
species, which will serve as a platform for future clinical application 
as well as mechanistic understanding of spermatogenesis. 

Studies on in vitro spermatogenesis date back to organ culture
experiments about a century ago. In 1937, it was reported that spermatogenesis
proceeded up to the pachytene stage of meiosis in testis tissues of
newborn mice placed on a clot. In the 1960s, organ culture methods had
advanced and various conditions were extensively examined. However, it
was not possible to promote spermatogenesis beyond the pachytene
stage. Thereafter, cell culture methods, instead of organ culture, were
used with new concepts and devices, including immortalized germ cell
lines, the production of Sertoli cell lines as feeder cells, bicameral
chamber methods, etc. Despite such endeavours, progress has been
limited, and it is still impossible to produce fertility-proven haploid cells
from spermatogonial stem cells in vitro.

At the outset of our research on in vitro spermatogenesis, we decided 
to re-evaluate organ culture methods first. According to the standard 
gas-liquid interphase method, testis tissue fragments, 1-3 mm in
diameter, were placed on an agarose gel half-soaked in medium
(Fig. 1a). To make evaluation simple and easy, we exploited two lines
of transgenic mice: Gsg2-GFP (Gsg2 is also known as Haspin) and
Acr-GFP, where the GFP gene is under control of the Gsg2 and Acr
promoters, respectively. These marker green fluorescent proteins 
(GFPs) specific for meiosis and haploid cells were extremely useful 
for monitoring the progress of spermatogenesis in vitro (Supplemen- 
tary Fig. 1). Then, we devised a grading system for the extension of 
GFP expression to quantify the progress of spermatogenesis in each 
tissue (Supplementary Fig. 2). 

In our previous experiments, the optimal temperature for organ
culture was 34°C, and the most effective medium was αMEM (or
RPMI) + 10% FBS (Supplementary Fig. 3). Among others, FBS
was indispensable to induce spermatogenesis in the organ culture
experiments (Fig. 1b). Under these conditions, we found that round
spermatids, haploid cells, were produced in an organ culture
experiment using 7.5-10.5 days postpartum (dpp) pup mouse testis tissues.


However, we were not able to identify any elongating spermatids or 
sperm. In addition, as meiosis starts around 7 dpp in mice, the testis
tissues may have included some spermatocytes from the beginning. 
Therefore, the study is inconclusive regarding whether or not haploid 
cells were produced from spermatogonia. 
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Figure 1 | Effect of FBS and serum replacement on pup testis tissues. 

a, Stereomicroscopic and histological views of testis tissue fragments of 7.5 dpp 
pups. The tissues were placed on agarose gel stands half-soaked in the medium. 
b, Stereomicroscopic and histological views of testis tissue fragments of 7.5 dpp 
cultured with αMEM medium without or with FBS. c, Testis tissue fragments of
7.5 dpp Gsg2-GFP transgenic mice were grown with RPMI supplemented with
10% FBS or 10% KSR. d, Five media, RPMI (control), RPMI + 10% FBS,
RPMI + 10% KSR, RPMI + B27 and TKM, were compared on the basis of the
extent of Gsg2-GFP expression scored using the grading scale on culture day 20
(means ± s.d., n = 6-8, *P < 0.0001). e, The 10% KSR induced stronger GFP
expression and maintained the expression for a longer period than 10% FBS
(mean ± s.d.; n = 8 and 7 for KSR and FBS, respectively). Scale bars, 50 µm
(a, b); 0.3 mm (c).
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Figure 2 | Effect of KSR on neonatal testis tissues. a, 10% KSR induced the 
expression of both Acr-GFP and Gsg2-GFP in 2.5 and 0.5 dpp mouse testes, 
respectively. Pictures were taken on culture days 27 and 39, respectively.
b, Immunostaining with anti-SYCP3 antibody. c, Acr-GFP-expressing cells
(green) at the pachytene stage were also stained with SYCP1 (red). The merged
picture includes Hoechst nuclear stain (blue). d, In a cryosection of a Gsg2-GFP
testis tissue, SYCP1 was demonstrated in cells (red) outside the
Gsg2-GFP-positive cells (green). Hoechst (blue). The boxed area is enlarged in
the right panel. e, Gsg2-GFP-expressing testis tissue, originating from 2.5 dpp
mice and cultured for 21 days, was cryosectioned and stained with antibodies
against GFP and AR, and counterstained with Hoechst dye. Scale bars, 0.5 mm (a);
5 µm (b, c) and 30 µm (d, e).

For further improvements of the culture media, we have tested
different kinds of factors reported to be effective for promoting
spermatogenesis or in the development of immature testes. However, none
of these factors were able to promote Acr-GFP or Gsg2-GFP expression
in our pilot studies (Supplementary Fig. 4). These results raised the
possibility that FBS contains factors which suppress the progress of
spermatogenesis, thus preventing further refinements of the culture
conditions. To overcome such a possible limitation of FBS, we
performed culture experiments using KSR and B27 (ref. 23) as serum
replacement or serum-free TKM medium. Whereas B27 and TKM
did not induce GFP expression, KSR induced the expression of both
Acr- and Gsg2-GFP. Surprisingly, the level of GFP expression induced
by KSR was stronger than that induced by FBS in every experiment
(Fig. 1c, d). In addition, KSR significantly extended the duration of
GFP expression in culture (Fig. 1e). KSR is commonly used in culture
media for embryonic stem cells to promote their proliferation while
keeping them in an undifferentiated state. However, as it has
rarely been used for organ culture experiments, the present results were
unexpected.

Thus, we used KSR on more immature testes of neonates, 0.5-2.5 dpp,
and found the expression of both Acr-GFP and Gsg2-GFP (Fig. 2a). The
effect of KSR was evident compared to that of FBS (Supplementary Fig.
5a, b). In order to confirm that such GFP expression reflected genuine
meiosis, we examined the expression of meiotic marker proteins, SYCP1
and SYCP3 (ref. 5), by immunochemistry. The GFP-expressing cultured
tissues were dissociated and stained with SYCP3, showing representative
chromosomal spreading in some cells (Fig. 2b). When GFP and SYCP1
were costained in Acr-GFP-expressing cultured tissues, they were
colocalized in the pachytene stage of spermatocytes (Fig. 2c). With Gsg2-GFP
testis tissues, it was shown that SYCP1 also stained pachytene-stage
spermatocytes located just on the outer side of Gsg2-GFP-positive cells
in the seminiferous tubules, confirming that Gsg2-GFP-expressing cells
were finishing meiosis (Fig. 2d). These results demonstrated that
authentic meiosis progressed in the testis tissues cultured with KSR.
The somatic cells, Sertoli and peritubular myoid cells, expressed
androgen receptors (AR), a mediator of testosterone effects that is essential
for spermatogenesis, demonstrating their maturity to support
spermatogenesis under the culture conditions (Fig. 2e).

We then set up experiments to find haploid cells in the cultured tissues.
First, we found many spermatids at steps 2-8 (refs 2, 18) in cultured
samples of the Acr-GFP testis after the mechanical dissociation of cells
into suspension in six out of the seven tissues examined, cultured for
23-50 days (Fig. 3a and Supplementary Fig. 6). In addition, we observed
flagellated sperm in 5 out of the 11 samples examined, which were
cultured for 27-45 days (Fig. 3b, c). These findings were also supported
by flow cytometric analysis of dissociated cells from cultured tissues,
which identified cells showing 1N ploidy as a sign of spermatids, along
with 2N and 4N cells (Fig. 3d and Supplementary Fig. 7). Meanwhile,
histological examinations revealed the overall phenomena occurring in
the cultured tissues (Supplementary Fig. 8). In the peripheral region of
each tissue piece, spermatogenesis, up to elongating spermatid
formation, was observed (Fig. 3e, f). In some experiments of extended culture
period, Gsg2-GFP expression remained at its highest level until around
a corresponding age of 30-40 dpp, and then gradually decreased, but
lasted beyond 70 dpp. The formation of sperm was confirmed at both 38
and 60 days of culture in a single experiment (Supplementary Fig. 9).
Our organ culture system, therefore, was able to induce and maintain
spermatogenesis for more than 2 months.

Figure 3 | Formation of spermatids. a, A GFP-positive acrosomal cap was
observed, indicating the presence of step 6 spermatids, in samples of dissociated
testis tissues of Acr-GFP mouse neonates cultured for 27 days. Inset shows
Hoechst-stained Acr-GFP spermatid. b, In the same sample, flagellated cells
were observed. c, Sperm with Acr-GFP acrosome was stained with Hoechst.
d, Gsg2-GFP mouse testis tissues, 1.5 dpp, cultured for 30 days, were subjected
to flow cytometry. Tissues cultured with KSR included cells of 1N (arrow, 1.71%
of all cells). The adult testis served as the control. e, f, Thin section of the tissues
of 2.5 dpp Gsg2-GFP mice, cultured for 38 days, showed well developed
seminiferous tubules showing spermatogenesis. The white box in e is enlarged
in f, showing sperm formation (arrows). Scale bars, 5 µm (a, b, c) and 50 µm (e).

Finally, we tested the fertility of spermatids and sperm produced in
vitro by microinsemination. Round spermatids retrieved from tissues
originating from 3.5 dpp testes cultured for 23 days were used for
insemination with the round spermatid injection (ROSI) technique.
Sperm retrieved from tissues originating from 2.5 dpp testes
cultured for 42 days were used for intracytoplasmic sperm injection
(ICSI) (Fig. 4a, b). Using 23 and 35 oocytes for ROSI and ICSI,
respectively, 7 and 5 live offspring were delivered (Fig. 4c, Supplementary
Table 1) and weaned at 3 weeks (Fig. 4d). Although these experiments
were small in scale and used only a single line of mice, the efficiencies
of progeny production with the in vitro-produced gametes were
comparable to those with in vivo-generated counterparts. PCR analysis of
their tail tip DNA identified 4 GFP-carrying offspring out of 12,
compatible with cultured testis tissues being heterozygous for Gsg2-GFP
(Fig. 4e). Their reproductive capacity was examined by brother-sister
mating, demonstrating that all four males and eight females were
fertile (Supplementary Table 2).
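
The numbers in this paragraph lend themselves to a short worked check. The
sketch below computes the offspring-per-oocyte rates from the counts quoted
above and asks whether 4 GFP-positive pups out of 12 is compatible with the
50% transmission expected from a heterozygous transgenic parent. The exact
two-sided binomial test is our choice for illustration and is not an
analysis reported by the authors.

```python
from math import comb


def binomial_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial P-value: sum of outcomes no more likely than k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    return sum(q for q in probs if q <= probs[k] * (1 + 1e-9))


# Counts quoted in the text.
rosi_offspring, rosi_oocytes = 7, 23
icsi_offspring, icsi_oocytes = 5, 35
gfp_positive, total_offspring = 4, rosi_offspring + icsi_offspring

rosi_rate = rosi_offspring / rosi_oocytes  # live offspring per injected oocyte
icsi_rate = icsi_offspring / icsi_oocytes
p_gfp = binomial_two_sided(gfp_positive, total_offspring, p=0.5)
```

A P-value well above 0.05 here means only that the observed 4 of 12 does not
contradict heterozygous transmission; with 12 offspring the test has little
power to detect a modest deviation.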

Figure 4 | Fertility of sperm and spermatids produced in vitro. a, In the
microinsemination experiment, elongated spermatids and spermatozoa among
round spermatids were observed. b, Sperm were retrieved from the testis tissues
of 2.5 dpp mice after culturing for 42 days. c, Offspring were produced by ICSI
and ROSI with sperm and round spermatids, respectively. d, A photo of
offspring at 7 weeks. e, Tail tip DNA analysis by PCR for GFP. C, DNA sample
taken from a Gsg2-GFP mouse tail tip as a positive control; M, marker. Scale
bars, 10 µm (a, b) and 1 cm (c).

We thought that the cryopreservation of testis tissues, if feasible,
would extend our results to a variety of practical applications.
Therefore, we froze neonatal testis tissues and placed them in liquid
nitrogen for 4-25 days. After thawing, they were grown on agarose in
the same manner as for non-frozen samples. GFP expression was
observed in all four experiments using Acr-GFP mice, and in two
experiments using Gsg2-GFP mice (Supplementary Fig. 10a). In three
of the four experiments using Acr-GFP mice, GFP-expressing
acrosomes were observed in mechanically dissociated samples
(Supplementary Fig. 10b). On the histological examination of cultured
Gsg2-GFP-expressing tissues, elongated spermatids were observed in
one out of the five tissues examined (Supplementary Fig. 10c). These
results demonstrated that testis tissue fragments can be cryopreserved
and resume full spermatogenesis later in vitro.

KSR was vital for the success of the present experiments. Elucidating 
the mechanism of its action and identifying the critical factors are 
important for the further refinement of our culture conditions. We 
have tested several factors reportedly contained in KSR, and found
that lipid-rich bovine serum albumin (AlbuMAX) was probably the 
most critical component regarding our present results, because the 
addition of AlbuMAX in place of KSR led to almost the same results 
(Supplementary Fig. 11a, c). In addition, it seemed that FBS does not 
contain factors which inhibit the progression of spermatogenesis, 
because medium containing both FBS and AlbuMAX was as effective 
in inducing spermatogenesis as that containing AlbuMAX alone 
(Supplementary Fig. 11b). Further studies to identify key molecule(s) 
in KSR and AlbuMAX are warranted. 

We have demonstrated that organ culture conditions, which lack the 
circulatory system present in vivo, can support complete spermatogenesis 
in mice. Extending the present results to a wide range of species, by 
refining the culture conditions and tailoring them to each species, 
therefore seems promising. Such progress will contribute to the 
elucidation of the molecular mechanisms of spermatogenesis and to the 
development of new diagnostic and therapeutic techniques for male 
infertility. 


METHODS SUMMARY 


Acr-GFP and Gsg2-GFP transgenic mice were mated with female mice of ICR, 
C57BL/6, or ICR × C57BL/6 F1 to produce pups. The pups were used for the 
culture experiments at 0.5 to 11.5 dpp. The testis tissues were placed on agarose 
gel half-soaked in the medium. The cultured tissues were observed every 3-7 days 
under a stereomicroscope equipped with an excitation light for GFP to score the 
extent of GFP expression in each tissue. They were also processed for histological 
and immunohistological examinations. For the observation of Acr-GFP acrosomes, 
cultured tissues were mechanically dissociated using needles to release cells 
into PBS. For cryopreservation, fragments of testis tissues were immersed in 
cryoprotectants for several hours or overnight at 4 °C, and then placed at 
−70 °C overnight before being stored in liquid nitrogen. On the initiation of 
culture, tissues were thawed at room temperature, soaked briefly in the medium, 
and then placed on agarose for culture. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 


Received 8 June 2010; accepted 17 January 2011. 


1. Clermont, Y. Kinetics of spermatogenesis in mammals: seminiferous epithelial 
cycle and spermatogonial renewal. Physiol. Rev. 52, 198-236 (1972). 

2. Russell, L. D., Ettlin, R. A., Sinha Hikim, A. P. & Clegg, E. D. in Histological and 
Histopathological Evaluation of the Testis 1-40 (Cache River, 1990). 

3. Staub, C. A century of research on mammalian male germ cell meiotic 
differentiation in vitro. J. Androl. 22, 911-926 (2001). 

4. Parks, J. E., Lee, D. R., Huang, S. & Kaproth, M. T. Prospects for spermatogenesis in 
vitro. Theriogenology 59, 73-86 (2003). 

5. La Salle, S., Sun, F. & Handel, M. A. Isolation and short-term culture of mouse 
spermatocytes for analysis of meiosis. Methods Mol. Biol. 558, 279-297 (2009). 

6. Miura, T., Yamauchi, K., Takahashi, H. & Nagahama, Y. Hormonal induction of all 
stages of spermatogenesis in vitro in the male Japanese eel (Anguilla japonica). 
Proc. Natl Acad. Sci. USA 88, 5774-5778 (1991). 

7. Sakai, N. Transmeiotic differentiation of zebrafish germ cells into functional sperm 
in culture. Development 129, 3359-3365 (2002). 

8. Champy, C. Quelques résultats de la méthode de culture des tissues. Arch. Zool. 
Exp. Gen. 60, 461-500 (1920). 

9. Martinovitch, P. N. Development in vitro of the mammalian gonad. Nature 139, 413 
(1937). 

10. Steinberger, A., Steinberger, E. & Perloff, W. H. Mammalian testes in organ culture. 
Exp. Cell Res. 36, 19-27 (1964). 

11. Steinberger, A. & Steinberger, E. Factors affecting spermatogenesis in organ 
cultures of mammalian testes. J. Reprod. Fertil. Suppl. 2, 117-124 (1967). 

12. Feng, L. X. et al. Generation and in vitro differentiation of a spermatogonial cell line. 
Science 297, 392-395 (2002). 

13. Rassoulzadegan, M. et al. Transmeiotic differentiation of male germ cells in culture. 
Cell 75, 997-1006 (1993). 

14. Staub, C. et al. The whole meiotic process can occur in vitro in untransformed rat 
spermatogenic cells. Exp. Cell Res. 260, 85-95 (2000). 

15. Trowell, O. A. The culture of mature organs in a synthetic medium. Exp. Cell Res. 16, 
118-147 (1959). 

16. Tanaka, H. et al. Identification and characterization of a haploid germ cell-specific 
nuclear protein kinase (Haspin) in spermatid nuclei and its effects on somatic 
cells. J. Biol. Chem. 274, 17049-17057 (1999). 

17. Nakanishi, T. et al. Real-time observation of acrosomal dispersal from mouse 
sperm using GFP as a marker protein. FEBS Lett. 449, 277-283 (1999). 

18. Ventela, S. et al. Regulation of acrosome formation in mice expressing green 
fluorescent protein as a marker. Tissue Cell 32, 501-507 (2000). 

19. Gohbara, A. et al. In vitro murine spermatogenesis in an organ culture system. Biol. 
Reprod. 83, 261-267 (2010). 

20. Goetz, P., Chandley, A. C. & Speed, R. M. Morphological and temporal sequence of 
meiotic prophase development at puberty in the male mouse. J. Cell Sci. 65, 
249-263 (1984). 

21. Goldsborough, M. D. et al. Serum-free culture of murine embryonic stem (ES) cells. 
Focus 20, 8-12 (1998). 

22. Price, P. J., Goldsborough, M. D. & Tilkins, M. L. Embryonic stem cell serum 
replacement. PCT/US98/00467: WO 98/30679 (1998). 

23. Wachs, F. P. et al. High efficacy of clonal growth and expansion of adult neural stem 
cells. Lab. Invest. 83, 949-962 (2003). 

24. Tres, L. L. & Kierszenbaum, A. L. Viability of rat spermatogenic cells in vitro is 
facilitated by their coculture with Sertoli cells in serum-free hormone- 
supplemented medium. Proc. Natl Acad. Sci. USA 80, 3377-3381 (1983). 

25. Sharpe, R. M. in Sertoli Cell Biology (eds Skinner, M. K. & Griswold, M. D.) 199-216 
(Elsevier Academic Press, 2006). 

26. Ogonuki, N. et al. The effect on intracytoplasmic sperm injection outcome of 
genotype, male germ cell stage and freeze-thawing in mice. PLoS ONE 5, e11062 
(2010). 


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank S. Yoshida and G. Yoshizaki for comments and 
pre-submission review. We thank A. Tanaka and Y. W. Zheng for their technical help in 
flow cytometric analysis. T.O. is grateful to his mentor R. L. Brinster for his advice on 
devising the experimental strategies and for his encouragement. We would like to 
thank A. Steinberger and the late E. Steinberger, whose work on in vitro 
spermatogenesis laid the groundwork for our present study. This work was 
supported by a Grant-in-Aid for Scientific Research on Innovative Areas, 'Regulatory 
Mechanism of Gamete Stem Cells' (#20116005); a Grant-in-Aid for Scientific 
Research (C) (#21592080) from the Ministry of Education, Culture, Sports, Science, 
and Technology, Japan; a grant from the Yokohama Foundation for Advancement of 
Medical Science; and a grant for Research and Development Project II (No. S2116) of 
Yokohama City University, Japan (to T.O.). 


Author Contributions T.S. performed the experiments, interpreted the results, and 
prepared the manuscript. K.K. performed all culture experiments. A.G. contributed to 
the culture experiments. K.I. and N.O. performed microinsemination experiments. A.O. 
performed microinsemination experiments and discussed the results. Y.K. supervised 
the project and discussed the results. T.O. designed and performed the experiments 
and wrote the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to T.O. (ogawa@med.yokohama-cu.ac.jp). 


24 MARCH 2011 | VOL 471 | NATURE | 507 


©2011 Macmillan Publishers Limited. All rights reserved 




METHODS 

Animals. Acr-GFP and Gsg2-GFP transgenic mice were provided by RIKEN BRC 
through the National Bio-Resource Project of MEXT, Japan. Female mice of ICR, 
C57BL/6 (CLEA Japan), or ICR × C57BL/6 F1 were mated with a sire of the 
transgenic mice to produce pups. The pups were used for the culture experiments 
at 0.5 to 11.5 days post-partum (dpp). All animal experiments conformed to the 
Guide for the Care and Use of Laboratory Animals and were approved by the 
Institutional Committee of Laboratory Animal Experimentation (Animal 
Research Center of Yokohama City University, Yokohama, Japan). 

Culture media and reagents. The culture media used were α-Minimum Essential 
Medium (α-MEM) (Invitrogen), Roswell Park Memorial Institute 1640 (RPMI) 
(Invitrogen), DMEM (Dulbecco’s modified Eagle’s medium) (Invitrogen), and 
F-12 nutrient mixture (Ham) (Invitrogen). Serum and serum replacements used 
were fetal bovine serum (FBS) for embryonic stem cells (Gibco Invitrogen), 
KnockOut Serum Replacement (KSR) (Invitrogen), B-27 supplement (Gibco 
Invitrogen), and AlbuMAX (Invitrogen). The following factors were added to the 
culture media as indicated in the text: hepatocyte growth factor (HGF) 
(5 ng ml⁻¹) (R&D Systems), activin A (100 ng ml⁻¹) (Sigma-Aldrich), follicle 
stimulating hormone (FSH) from human pituitary (200 ng ml⁻¹) (Sigma-Aldrich), 
testosterone (1 μM) (Wako Pure Chemical Industries), recombinant human BMP-4 
(20 ng ml⁻¹) (R&D Systems), recombinant human BMP-7 (20 ng ml⁻¹) (R&D Systems), 
and bovine pituitary extract (50 μg ml⁻¹) (Invitrogen). 

Culture method. The testes of the neonatal or pup mice were decapsulated and 
gently separated by forceps into 1 to 8 pieces of 1-3 mm in diameter. The tissue 
fragments were then positioned on stands made of agarose gel placed in culture 
plate wells. To make the agarose gel stand, agarose-1 (Dojindo Molecular 
Technologies) was heated to dissolve it in distilled water (1.5% (w/v)) and then 
poured into a 10-cm dish. After cooling, the gels were cut into hexahedrons of 
about 10 × 10 × 5 mm in size. They were then soaked in the culture medium for 
more than 24 h to replace the water in them with the medium. Three to four pieces 
of the agarose gels were placed in the wells of a 6-well plate (Sumitomo Bakelite). 
Each gel was loaded with one to three testis tissue fragments. The amount of 
medium was adjusted so that it came up to one-half to four-fifths of the height 
of the agarose gel. The medium was changed once a week. The culture incubator 
was supplied with 5% carbon dioxide in air and maintained at 34 °C. 

Observations. The cultured tissues were observed every 3 to 7 days under a 
stereomicroscope equipped with an excitation light for GFP (Olympus SZX12; 
Olympus) to score the level of GFP expression of the tissues. For histological 
examination, the specimens were fixed with Bouin’s fixative and embedded in 
paraffin. One section showing the largest cut surface was made for each specimen 
and stained with haematoxylin and eosin (H&E). For immunofluorescence staining, 
tissues fixed with 4% paraformaldehyde in PBS were cryo-embedded in OCT 
compound (Sakura Finetechnical) and cut into 7-μm-thick sections. The first 
antibodies to be used were rabbit anti-androgen receptor (AR) antibody (1:500; Santa 
Cruz Biotechnology), rabbit anti-SYCP1 antibody (1:600; Novus Biologicals), and 
rabbit anti-GFP Alexa Fluor 488 conjugate (1:50; Invitrogen). Alexa Fluor 
546-conjugated goat anti-rabbit IgG (1:400; Invitrogen) was used as a second antibody 
for anti-AR and anti-SYCP1 antibodies. Nuclei were counterstained with Hoechst 
33342 dye. Specimens were observed with a microscope (Nikon Eclipse TE200) or 
confocal laser microscope (Olympus FV-1000D). For the detection of SYCP3, 
cultured tissues were mechanically dissociated with fine forceps, then fixed with 
4% paraformaldehyde in PBS, and stained with rabbit anti-SYCP3 antibody (1:400; 
Abcam) followed by Alexa Fluor 546-conjugated goat anti-rabbit IgG as a second 
antibody. For the observation of Acr-GFP acrosomes, cultured tissues were 
mechanically dissociated using needles to release cells into PBS. The cell suspension was 
observed with a microscope under GFP excitation light. 

Flow cytometric analysis. The cultured testis tissue fragments were treated with 
2 mg ml⁻¹ of collagenase for 15 min, followed by 0.25% trypsin/1 mM EDTA 
digestion for 10 min at 37 °C to dissociate cells. After passing through a cell 
strainer with a 40 μm pore size (Becton Dickinson), the cells were suspended in 
PBS containing 3% fetal bovine serum (FBS) and subjected to flow cytometry to 
analyse GFP-expressing cells using a MoFlo sorter (Dako Cytomation). For DNA 
ploidy, the singly-dissociated cells were fixed in 1% paraformaldehyde at 4 °C for 
15 min, followed by 70% ethanol at −25 °C for 12-24 h, and re-suspended and 
incubated in staining solution (0.1% Triton X-100 in PBS, 0.2 mg ml⁻¹ RNase A, 
0.02 mg ml⁻¹ propidium iodide) at 37 °C for 15 min. The flow cytometric analysis 
was also performed with the MoFlo sorter. 

Microinsemination. The cultured testes tissues were dissected out under a 
stereomicroscope. Round spermatids or spermatozoa were collected and injected into 
the ooplasm of wild-type matured oocytes of B6D2F1 using a Piezo-driven 
micromanipulator. For fertilization with round spermatids, oocytes were then activated 
by treatment with SrCl₂ in the presence of cytochalasin B to resume meiosis. After 
the formation of two female pronuclei, one was removed with a micro-pipette. 
Fertilized oocytes were cultured for 24 h, and two-cell embryos were transferred 
into the oviducts of pseudopregnant ICR females. Live fetuses retrieved on day 
19.5 were raised by lactating foster ICR dams. 

PCR analysis. Genomic DNA was extracted from the mouse tail with a DNeasy 
Tissue kit (Qiagen). The DNA samples (10 ng) were added to a 20-μl reaction 
mixture containing 0.25 μM of each enhanced green fluorescent protein 
(EGFP)-specific primer and Premix Ex Taq (Takara Bio). EGFP-specific primers were 
5′-TACGGCAAGCTGACCCTGAA-3′ and 5′-TGTGATCGCGCTTCTCGTTG-3′. 
The reaction profile was 30 cycles of denaturation at 95 °C for 60 s, annealing at 
60 °C for 30 s, and extension at 72 °C for 60 s. 
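
As a quick sanity check on the primers and cycling profile quoted above, the following Python sketch computes primer length, GC fraction, a rule-of-thumb melting temperature, and the total programmed hold time. The function names and the use of the simple "2 °C per A/T, 4 °C per G/C" estimate are illustrative assumptions and are not part of the published protocol; the hold time ignores ramping between temperatures.

```python
# EGFP-specific primers as quoted in the PCR analysis paragraph.
PRIMERS = {
    "forward": "TACGGCAAGCTGACCCTGAA",
    "reverse": "TGTGATCGCGCTTCTCGTTG",
}

# Cycling profile from the text: (step, temperature in deg C, hold in seconds).
CYCLE = [("denaturation", 95, 60), ("annealing", 60, 30), ("extension", 72, 60)]
N_CYCLES = 30


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq):
    """Rough Tm estimate (assumption): 2 C per A/T plus 4 C per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def total_hold_seconds(cycle, n_cycles):
    """Programmed hold time over all cycles, excluding temperature ramps."""
    return n_cycles * sum(seconds for _, _, seconds in cycle)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt",
              "GC =", round(gc_fraction(seq), 2),
              "Tm ~", rule_of_thumb_tm(seq), "C")
    print("hold time:", total_hold_seconds(CYCLE, N_CYCLES) / 60, "min")
```

Both primers come out at 20 nt with matching GC content, which is consistent with a single annealing temperature serving for the pair.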

Cryopreservation of testis tissues. The fragments of testis tissues, prepared in 
exactly the same way as for culture, were immersed in TC-protector cell-freezing 
medium (BUF050, AbD Serotec) for several hours to overnight, and then frozen at 
−70 °C overnight before being placed in liquid nitrogen. The tissues were stored in 
liquid nitrogen for 4-25 days. On the initiation of culture, cryotubes were placed at 
room temperature to thaw the cryoprotectant solution and tissues were taken out 
to be soaked briefly in the culture medium to remove the cryoprotectants. Then, 
they were placed on agarose for culturing. 

Statistical analysis. One-way analysis of variance (ANOVA) was used to compare 
differences between groups. Values with P < 0.05 were considered significant. 
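
The one-way ANOVA named above can be illustrated with a minimal pure-Python sketch of the F statistic. The group values in the test are made-up numbers, not data from the study, and the function name is an assumption; turning F into a P value additionally requires the F distribution (for example via a statistics package), which is not shown here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical scores for three culture conditions (illustration only).
    example = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(example))
```

A large F relative to the F distribution with the reported degrees of freedom corresponds to a small P value; the threshold used in the text is P < 0.05.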
Fat cells reactivate quiescent neuroblasts via TOR and 
glial insulin relays in Drosophila 


Rita Sousa-Nunes, Lih Ling Yee & Alex P. Gould 


Many stem, progenitor and cancer cells undergo periods of mitotic 
quiescence from which they can be reactivated. The signals triggering 
entry into and exit from this reversible dormant state are not 
well understood. In the developing Drosophila central nervous system, 
multipotent self-renewing progenitors called neuroblasts 
undergo quiescence in a stereotypical spatiotemporal pattern. 
Entry into quiescence is regulated by Hox proteins and an internal 
neuroblast timer. Exit from quiescence (reactivation) is subject 
to a nutritional checkpoint requiring dietary amino acids. Organ 
co-cultures also implicate an unidentified signal from an adipose/ 
hepatic-like tissue called the fat body. Here we provide in vivo 
evidence that Slimfast amino-acid sensing and Target of rapamycin 
(TOR) signalling activate a fat-body-derived signal (FDS) required 
for neuroblast reactivation. Downstream of this signal, Insulin-like 
receptor signalling and the Phosphatidylinositol 3-kinase (PI3K)/ 
TOR network are required in neuroblasts for exit from quiescence. 
We demonstrate that nutritionally regulated glial cells provide the 
source of Insulin-like peptides (ILPs) relevant for timely neuroblast 
reactivation but not for overall larval growth. Conversely, ILPs 
secreted into the haemolymph by median neurosecretory cells 
systemically control organismal size but do not reactivate neuroblasts. 
Drosophila thus contains two segregated ILP pools, one 
regulating proliferation within the central nervous system and the 
other controlling tissue growth systemically. Our findings support a 
model in which amino acids trigger the cell cycle re-entry of neural 
progenitors via a fat-body–glia–neuroblasts relay. This mechanism 
indicates that dietary nutrients and remote organs, as well as local 
niches, are key regulators of transitions in stem-cell behaviour. 

In fed larvae, Drosophila neuroblasts (Fig. 1a) exit quiescence from 
the late first instar (L1) stage onwards. This reactivation involves cell 
enlargement and entry into S phase, monitored in this study using the 
thymidine analogue 5-ethynyl-2′-deoxyuridine (EdU). Consistent 
with a previous study, we observed that reactivated neuroblast 
lineages (neuroblasts and their progeny; Fig. 1b) reproducibly 
incorporated EdU in a characteristic spatiotemporal sequence: central brain 
→ thoracic → abdominal neuromeres (Fig. 1c and Supplementary Fig. 
1). Mushroom-body neuroblasts and one ventrolateral neuroblast, 
however, are known not to undergo quiescence and to continue dividing 
for several days in the absence of dietary amino acids (Fig. 1a, c, f). 
This indicates that dietary amino acids are more than mere ‘fuel’, 
providing a specific signal that reactivates neuroblasts. However, 
explanted central nervous systems (CNSs) incubated with amino acids 
do not undergo neuroblast reactivation unless co-cultured with fat 
bodies from larvae raised on a diet containing amino acids. We therefore 
tested the in vivo requirement for a fat-body-derived signal (FDS) 
in neuroblast reactivation by blocking vesicular trafficking and thus 
signalling from this organ using a dominant-negative Shibire dynamin 
(SHIᴰᴺ). This strongly reduced neuroblast EdU incorporation, indicating 
that exit from quiescence in vivo requires an FDS (Fig. 1d, e). One 
candidate we tested was Ilp6, known to be expressed by the fat body, 
but neither fat-body-specific overexpression nor RNA interference of 
this gene significantly affected neuroblast reactivation (Supplementary 
Table 1 and data not shown). Fat-body cells are known to sense amino 
acids via the cationic amino-acid transporter Slimfast (SLIF), which 
activates the TOR signalling pathway, in turn leading to the production 
of a systemic growth signal. We found that fat-body-specific 
overexpression of the TOR activator Ras homologue enriched in brain 
(RHEB), or of an activated form of the p110 PI3K catalytic subunit, 
or of the p60 adaptor subunit, had no significant effect on neuroblast 
reactivation in fed animals or in larvae raised on a nutrient-restricted 
diet lacking amino acids (Fig. 1e, f and data not shown). In contrast, 
global inactivation of Tor, fat-body-specific Slif knockdown or 
fat-body-specific expression of the TOR inhibitors Tuberous sclerosis 
complex 1 and 2 (Tsc1/2) all strongly reduced neuroblast exit from 
quiescence (Fig. 1d, e). Together, these results show that a SLIF/TOR-dependent 
FDS is required for neuroblasts to exit quiescence and that 
this may be equivalent to the FDS known to regulate larval growth. 

Next we investigated the signalling pathways essential within 
neuroblasts for their reactivation. Nutrient-dependent growth is regulated 
in many species by the interconnected TOR and PI3K pathways 
(Supplementary Fig. 2). In fed larvae, we found that neuroblast 
inactivation of TOR signalling (by overexpression of TSC1/2), or PI3K 
signalling (by overexpression of p60, the Phosphatase and tensin 
homologue PTEN, the Forkhead box subgroup O transcription factor 
FOXO or dominant-negative p110), all inhibited reactivation (Fig. 1e). 
Conversely, stimulation of neuroblast TOR signalling (by overexpression 
of RHEB) or PI3K signalling (by overexpression of activated 
p110 or Phosphoinositide-dependent kinase 1 (PDK1)) triggered 
precocious exit from quiescence (Fig. 1e). RHEB overexpression had a 
particularly early effect, preventing some neuroblasts from undergoing 
quiescence even in newly hatched larvae (Supplementary Fig. 3). 
Hence, TOR/PI3K signalling in neuroblasts is required to trigger their 
timely exit from quiescence. Importantly, neuroblast overexpression of 
RHEB or activated p110 in nutrient-restricted larvae, which lack FDS 
activity, was sufficient to bypass the block to neuroblast reactivation 
(Fig. 1f). Notably, both genetic manipulations were even sufficient to 
reactivate neuroblasts in explanted CNSs, cultured without fat body or 
any other tissue (Fig. 1g). Together with the previous results this 
indicates that neuroblast TOR/PI3K signalling lies downstream of the 
amino-acid-dependent FDS during exit from quiescence. 

To identify the mechanism bridging the FDS with neuroblast TOR/ 
PI3K signalling, we tested the role of the Insulin-like receptor (InR) in 
neuroblasts (Supplementary Fig. 2). Importantly, a dominant-negative 
InR inhibited neuroblast reactivation, whereas an activated form 
stimulated premature exit from quiescence (Fig. 1e). Furthermore, InR 
activation was sufficient to bypass the nutrient restriction block to neuroblast 
reactivation (Fig. 1f). This indicates that at least one of the potential InR 
ligands, the seven ILPs, may be the neuroblast reactivating signal(s). By 
testing various combinations of targeted Ilp null alleles and genomic Ilp 
deficiencies, we found that neuroblast reactivation was moderately 
delayed in larvae deficient for both Ilp2 and Ilp3 (Df(3L)Ilp2-3) or 
lacking Ilp6 activity (Fig. 2a). Stronger delays, as severe as those observed in 
InR mutants, were observed in larvae simultaneously lacking the 
activities of Ilp2, 3 and 5 (Df(3L)Ilp2-3, Ilp5) or Ilp1-5 (Df(3L)Ilp1-5) 
(Fig. 2a). Despite the developmental delay in Df(3L)Ilp1-5 
homozygotes, neuroblast reactivation eventually begins in the normal spatial 
pattern, albeit heterochronically, in larvae with L3 morphology 
(Fig. 2b, compare timeline with Fig. 1c). Together, the genetic analysis 
shows that Ilp2, 3, 5 and 6 regulate the timing but not the spatial pattern 
of neuroblast exit from quiescence. However, as removal of some ILPs 
can induce compensatory regulation of others, the relative importance 
of each cannot be assessed from loss-of-function studies alone. 

Figure 1 | TOR/PI3K signalling in fat body and neuroblasts regulates 
reactivation. a, Diagram depicting larval fat body (FB) and CNS with central 
brain (CB), thoracic (Th) and abdominal (Ab) neuromeres, mNSCs, 
mushroom-body neuroblasts (MB NBs) and other neuroblasts (circles) 
indicated. b, Brain lobe (inset in Fig. 1a), showing EdU incorporation in 
postembryonic neuroblasts (large cells; for example, dotted circle) and their 
progeny (smaller cells), labelled with nab-GAL4 driving membrane GFP 
(Neuroblasts > mGFP). c, EdU incorporation time course from first-instar (L1) 
to third-instar (L3) larval stages in the wild-type (WT) CNS. OL, optic lobe. 
d, f, g, EdU-labelled CNSs from larvae expressing TOR/PI3K components 
driven by Cg-GAL4 (Fat body >) or nab-GAL4 (Neuroblasts >). e, Histograms 
of EdU⁺ voxels from thoracic CNSs of fed larvae, normalized to controls. In 
this and all subsequent figures, error bars are s.e.m.; *P < 0.05. See text, 
Methods and Supplementary Fig. 2 for details of molecules expressed. 

Brain median neurosecretory cells (mNSCs) (Fig. 1a) are an important 
source of ILPs, secreted into the haemolymph in an FDS-dependent 
manner to regulate larval growth. They express Ilp1, 2, 3 and 5, 
although not all during the same developmental stages. However, we 
found that none of the seven ILPs could reactivate neuroblasts during 
nutrient restriction when overexpressed in mNSCs (Supplementary 
Table 2). Similarly, increasing mNSC secretion using the NaChBac 
sodium channel or altering mNSC size using PI3K inhibitors/activators, 
which in turn alters body growth, did not significantly affect 
neuroblast reactivation under fed conditions (Fig. 2a, c, Supplementary 
Fig. 1b and L. Y. Cheng and colleagues, manuscript submitted). 
Surprisingly, therefore, mNSCs are not the relevant ILP source for 
neuroblast reactivation. Nonetheless, Ilp3 and Ilp6 messenger RNAs 
were detected in the CNS cortex, at the early L2 stage, in a domain 
distinct from the Ilp2⁺ mNSCs (Supplementary Fig. 4). Two different 
Ilp3-lacZ transgenes indicate that Ilp3 is expressed in some glia 
(Repo⁺ cells) and neurons (Elav⁺ cells). An Ilp6-GAL4 insertion (see 
Methods) indicates that Ilp6 is also expressed in glia, including the 
cortex glia surrounding neuroblasts and the glia of the blood-brain 
barrier (BBB) (Fig. 3a). 

We next assessed the ability of each of the seven ILPs to reactivate 
neuroblasts when overexpressed in glia or in neurons (Supplementary 
Table 2). Pan-glial or pan-neuronal overexpression of ILP4, 5 or 6 led 
to precocious reactivation under fed conditions (Fig. 3b, c). Each of 
these manipulations also bypassed the nutrient restriction block to 
neuroblast reactivation, as did overexpression of ILP2 in glia or in 
neurons, or ILP3 in neurons (Fig. 3d and Supplementary Table 2). 
In all of these ILP overexpressions, and even when ILP6 was expressed 
in the posterior Ultrabithorax domain (Fig. 3e), the temporal rather 
than the spatial pattern of reactivation was affected. Importantly, 
experiments blocking cell signalling with SHIᴰᴺ indicate that glia 
rather than neurons are critical for neuroblast reactivation (Fig. 4a, 
b). Interestingly, glial-specific overexpression of ILP3-6 did not signifi- 
cantly alter larval mass (Fig. 2c). Thus, in contrast to mNSC-derived 
ILPs, glial-derived ILPs promote CNS growth without affecting body 
growth. 
Figure 2 | Insulin-like peptides but not mNSCs control neuroblast 
reactivation. a, EdU-labelled CNSs from various Ilp or InR mutants show 
decreased reactivation whereas larvae with Ilp2-GAL4 driving UAS-p60 
(mNSCs > p60) do not. b, EdU incorporation time course in the CNS of 
[...] significantly altered by Ilp2-GAL4 (mNSCs >) driving PI3K signalling 
components but not by repo-GAL4 (Glia >) driving Ilp genes. 

Figure 3 | CNS-specific Insulin-like peptides are sufficient for neuroblast 
reactivation. a, Panels show expression of Ilp3-nLacZ in subsets of neurons 
(XD311-11) and glia (XD311-1) and Ilp6-GAL4 (Ilp6 > nGFP and 
Ilp6 > mGFP) in glia, including BBB and cortex glia. b, d, EdU-labelled CNSs 
from larvae overexpressing Ilp genes in various cell types (see Methods for 
GAL4 drivers used). c, Histograms of normalized EdU⁺ voxels in the thoracic 
CNS for the genotypes in b. e, Ilp6 overexpression in the Ultrabithorax domain 
(Ubx > Ilp6) reactivates neuroblasts in the normal spatial pattern during 
nutrient restriction (NR; left panel). Quiescent/enlarging neuroblasts in the 
central brain, far from the posterior Ubx domain (middle panels), extend 
cytoplasmic processes (arrowheads) towards the neuropil, close to long Ubx⁺ 
cell processes (right panel). The range of Ilp6 activity is difficult to determine 
from this experiment. Neurons, glia and neuroblasts are marked by Elav, Repo 
and Miranda respectively. 
Figure 4 | Ilp6-expressing glia are nutritionally regulated. a, EdU-labelled 
CNSs from larvae expressing the components indicated. b, Histograms of 
normalized EdU⁺ voxels in the thoracic CNS for genotypes in a. c, Ilp6 > nGFP 
expression in the CNS and fat body of fed versus nutrient-restricted (NR) 
larvae. d, Relay model for amino-acid-dependent fat body regulation of CNS 
and body growth. CNS-restricted (green) and systemic (purple) pools of 
Insulin-like peptides (ILPs) are functionally segregated. Direct amino-acid 
sensing by glia and neuroblasts may contribute to neuroblast reactivation 
(dashed arrows). See text for details. 


Focusing on ILP6, we used CNS explant cultures to demonstrate 
directly that glial overexpression was sufficient to substitute for the 
FDS during neuroblast exit from quiescence (Fig. 3d). In vivo, ILP6 was 
sufficient to induce reactivation during nutrient restriction when 
overexpressed via its own promoter or specifically in cortex glia but not in 
the subperineurial BBB glia, nor in many other CNS cells that we tested 
(Fig. 3d and Supplementary Table 1). Hence, cortex glia possess the 
appropriate processing machinery and/or location to deliver 
reactivating ILP6 to neuroblasts. Ilp6 mRNA is known to be up- rather than 
downregulated in the larval fat body during starvation and, 
accordingly, Ilp6-GAL4 activity is increased in this tissue after nutrient 
restriction (Fig. 4c). Conversely, we found that Ilp6-GAL4 is strongly 
downregulated in CNS glia during nutrient restriction (Fig. 4c). Thus, 
dietary nutrients stimulate glia to express Ilp6 at the transcriptional 
level. Consistent with this, an important transducer of nutrient signals, 
the TOR/PI3K network, is necessary and sufficient in glia (but not in 
neurons) for neuroblast reactivation (Fig. 4a, b). Together, the genetic 
and expression analyses indicate that nutritionally regulated glia relay 
the FDS to quiescent neuroblasts via ILPs. 

This study used an integrative physiology approach to identify the 
relay mechanism regulating a nutritional checkpoint in neural 
progenitors. A central feature of the fat-body–glia–neuroblasts relay 
model is that glial insulin signalling bridges the amino-acid/TOR-dependent 
FDS with InR/PI3K/TOR signalling in neuroblasts 
(Fig. 4d). The importance of glial ILP signalling during neuroblast 
reactivation is also underscored by an independent study, published 
while this work was under revision. As TOR signalling is also 
required in neuroblasts and glia, direct amino-acid sensing by these 
cell types may also impinge upon the linear tissue relay. This would 
then constitute a feed-forward persistence detector, ensuring that 
neuroblasts exit quiescence only if high amino-acid levels are sustained 
rather than transient. We also showed that the CNS ‘compartment’ in 
which glial ILPs promote growth is functionally isolated, perhaps by 
the BBB, from the systemic compartment where mNSC ILPs regulate 
the growth of other tissues. The existence of two functionally separate 
ILP pools may explain why bovine insulin cannot reactivate 
neuroblasts in CNS organ culture, despite being able to activate Drosophila 
InR in vitro. Given that insulin/PI3K/TOR signalling components 
are highly conserved between insects and vertebrates, it will be 
important to address whether mammalian adipose or hepatic tissues signal to 
glia and whether or not this involves an insulin/IGF relay to CNS 
progenitors. In this regard, it is intriguing that brain-specific 
overexpression of IGF1 can stimulate cell-cycle re-entry of mammalian 
cortical neural progenitors, indicating utilization of at least part of 
the mechanism identified here in Drosophila. 


METHODS SUMMARY 


For GAL4/UAS experiments, Drosophila were raised at 29°C unless otherwise 
stated. Larvae hatching within a 2h window were transferred to cornmeal food 
(5.9% glucose, 6.6% cornmeal, 1.2% baker’s yeast, 0.7% agar in water) or nutrient- 
restricted medium (5% sucrose, 1% agar in PBS) and further synchronized by 
selecting L2 larvae morphologically from an L1/L2 moulting population. For EdU 
experiments, dissected CNSs were incubated for 1 h in 10 µM EdU/PBS, fixed for 
15 min in 4% formaldehyde/PBS and Alexa Fluor azide was detected according to 
instructions (Click-iT EdU Imaging Kit, Invitrogen). CNS explants were cultured 
on 8-µm pore-size inserts in Schneider's medium, 10% fetal calf serum, 2 mM 
L-glutamine (Gibco) and 1X Pen Strep (Gibco) in 24-well Transwell plates 
(Costar) in a humidified chamber at 25 °C. For EdU quantifications, the ‘thoracic’ 
region used corresponds to the ventral nerve cord from the level of the brain lobes 
down to A1/A2. EdU+ voxels were quantified using Volocity (Improvision) from an 
average of ten CNSs per experimental genotype, normalized to controls processed in 
parallel (siblings or half-siblings), using Leica SP5 scans (LAS AF software) with a 
1.5-µm-step z-series. For larval mass measurements, triplicates of ~50 wandering 
L3 male larvae per genotype were transferred to pre-weighed microfuge tubes and 
wet weights determined using a Precisa XB 120A balance. For all histograms, error 
bars represent the s.e.m. and P values are from two-tailed Student’s t-tests with equal 
sample variance. Further details can be found in Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Rearing and staging of Drosophila larvae. To assist larval genotyping, lethal 
chromosomes were re-established over Dfd-YFP balancers. For EdU experiments, 
crosses were performed in cages with grape-juice plates (25% (v/v) grape-juice, 
1.25% (w/v) sucrose, 2.5% (w/v) agar) supplemented with live yeast paste. For 
GAL4/UAS experiments, larvae hatched within a 2-h time window were trans- 
ferred to our standard cornmeal food (5.9% w/v glucose, 6.6% cornmeal, 1.2% 
baker’s yeast, 0.7% agar in water) or to nutrient-restricted (NR) medium (5% 
sucrose, 1% agar in PBS). Drosophila were raised at 29°C throughout, with the 
following exceptions owing to lethality at high temperature: tub-GAL80*, 
repo > shi?’ and tub-GAL80*, repo > Ilp2 Drosophila were raised at 25 °C during 
embryogenesis and 29 °C during larval development; other Ilp2 overexpressions 
were performed at 25 °C throughout. At the time of dissection, development was 
further synchronized by selecting L2 animals morphologically from a mixed L1/L2 
moulting population. Df(3L)Ilp2-3, Ilp5° and Df(3L)Ilp1-5 homozygotes develop 
considerably slower than controls, so EdU-incorporation experiments used mor- 
phological staging after the L1/L2 and L2/L3 moults. Co-expression of Dcr-2 was 
used to enhance knockdown efficiency for the slif*” allele, in which antisense slif 
sequences are under UAS control from an EP element insertion. As absolute 
numbers of reactivated neuroblasts can vary with small differences in temperature 
and humidity, parallel control experiments were carried out for each genetic 
background (using the siblings or half-siblings of experimental animals), rather 
than using a single control. 

Drosophila strains. Stocks used in this study were: Tor”? (ref. 31), INR® (ref. 32), 
Df(3L)Ilp1-5, Df(X)Ilp6 and Df(X)IIp7 (ref. 33), Ilp1", Ilp2’, Ip3", Ilp4’, Ilp5", lps’, 
Df(X)IIpo"', Df(X)IIpo®*, Ilp7', Df(3L)Ilp2-3, Df(3L)IIp1-4 (ref. 34), slif*N™ (ref. 
35), UAS-Tscl/2 (ref. 36), UAS-Rheb, UAS-InRP' = UAS-InRK'#4, UAS- 
TnRAC = UAS-InR41°?°, BB driver = Cg-GAL4 (ref. 37), pan-glial driver = 
repo-GAL4 (ref. 38), OK107-GAL4 (ref. 39), eg-GAL4 (ref. 40), DopR-GAL4 
(ref. 41), btl-GAL4 (ref. 42), UAS-CD8::GFP, FRT82B, tub-GAL80", Sco/CyO, 
Dfd-YFP and Dr/TM6B, Sb, Dfd-YFP (Bloomington Drosophila Stock Center), 
UAS-p110°N = UAS-p110°78°°° and UAS-p110*“T = UAS p110** (ref. 43), 
UAS-Pten (ref. 44), UAS-p60 (ref. 45), UAS-Pdk*°? = UAS-Pdk14*”" (ref. 46), 
UAS-foxo (ref. 47), UAS-Dcr-2 (VDRC), UAS-Ilp7 (ref. 48), UAS-Ilp1, UAS-Ilp2, 
UAS-IIp3, UAS-IIp4, UAS-IIp5, UAS-IIp6,  Ilp3-nLacZ?*P?""!— and Ilp3- 
nLacZ?*?3!!-1 (both recapitulate endogenous I/p3 expression in L3 mNSCs, ref. 
49), UAS-shiP (ref. 50), NB driver = nab-GAL4*?*™ (ref. 51), cortex glia 
driver = NP577-GAL4 and ensheathing glia driver = NP6520-GAL4 (ref. 52), 
Ilp6-GAL4 = NP1079-GAL4 (NIG stock centre), subperineurial BBB glia driver = 
moody-GAL4 (ref. 53), midline glia driver = slit-GAL4 (ref. 54), midline glia/ 
neuronal driver = sim-GAL4 (ref. 55), mNSC driver = Ilp2-GAL4 (ref. 56), pan- 
neuronal driver = n-syb-GAL4 (ref. 57), Ubx-GAL4 (ref. 58), wg-GAL4 (ref. 59), 
en-GAL4 was a gift from A. Brand via J.-P. Vincent, repo-FLP (ref. 60). 

EdU detection, immunostaining, in situ hybridization and imaging. L1 and L2 
tissues were immobilized on poly-L-Lysine-coated slides for all stainings, except for 
CNS explants. For EdU experiments, dissected CNSs were incubated for 1 h in 10 µM 
EdU/PBS, fixed for 15 min in 4% formaldehyde/PBS, followed by detection of Alexa 
Fluor azide according to the manufacturer’s instructions (Click-iT EdU Imaging Kit, 
Invitrogen) and washing in 0.1% Triton/PBS. Antibody staining and in situ hybrid- 
ization were performed according to standard protocols. Primary antibodies used in 
this study were: rabbit anti-β-Galactosidase (Molecular Probes) 1/2,000; rabbit anti-GFP 
(Invitrogen) 1/1,000; mouse anti-Repo 1/20; mouse anti-Miranda 1/20 and rat 
anti-Elav 1/100 (Developmental Studies Hybridoma Bank); pre-adsorbed alkaline- 
phosphatase-conjugated sheep anti-digoxigenin (Roche) 1/2,000. Secondary 
antibodies used were: F(ab′)2 fragments conjugated to either Alexa-Fluor-488, 
Alexa-Fluor-633 (Molecular Probes) or Cy3 (Jackson), used at 1/250-1/2,000. Live 
tissues were photographed in PBS. Fixed tissues labelled for fluorescence microscopy 
were mounted in Vectashield (Vector Laboratories) whereas those processed for in 
situ hybridization were mounted in 80% glycerol. Fluorescent images were acquired 
with a Leica SP5 confocal microscope (LAS AF software) and bright-field images 
were acquired with a Zeiss Axiophot2 microscope (AxioVision software). Images of 
the whole CNS are projections of a 1.5-µm-step z-series. Images of fat body and of 
high-magnification double-labels of parts of the CNS are single sections except for the 
right panel of Fig. 3e, which is a projection of 13 sections from a z-series. 

CNS explant cultures. Explanted CNSs from larvae hatched within a 2-h window 
were cultured for 3-4 days on 8-µm pore-size inserts in 10 µM EdU in Schneider’s 
medium, 10% fetal calf serum, 2 mM L-glutamine (Gibco) and 1X Pen Strep (Gibco), 
in 24-well Transwell plates (Costar) placed in a humidified chamber at 25 °C. 

Quantification of EdU incorporation. The ‘thoracic’ region used for EdU quan- 
tifications corresponds to the ventral nerve cord from brain-lobe level down to A1/ 
A2, distinguishable from more posterior neuromeres by a sharp transition in 
neuroblast density (Fig. 1a). The numbers of EdU+ voxels per CNS were determined 
using Volocity (Improvision) from Leica SP5 confocal microscope scans 
(LAS AF software) using a 1.5-µm-step z-series. An average of ten CNSs were 
quantified per experimental genotype and controls (siblings or half-siblings) were 
processed in parallel. Control and experimental values were normalized using the 
average number of control EdU+ voxels. For all histograms, error bars represent 
standard error of the mean (s.e.m.) of normalized values and asterisks indicate 
P<0.05 using two-tailed Student’s t-tests with equal sample variance. 
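
The normalization and statistics described above can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the voxel counts are hypothetical, the function names are invented for this example, and only the t statistic is computed (a two-tailed P value would then be read from a t distribution with the returned degrees of freedom).

```python
import math
import statistics


def normalize_to_control(control, experimental):
    """Divide every value by the mean of the control group."""
    control_mean = statistics.mean(control)
    return ([v / control_mean for v in control],
            [v / control_mean for v in experimental])


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return statistics.stdev(values) / math.sqrt(len(values))


def pooled_t_statistic(a, b):
    """Student's t statistic assuming equal variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical EdU+ voxel counts per CNS (not data from the paper).
control_voxels = [1000.0, 1200.0, 800.0, 1000.0]
mutant_voxels = [400.0, 600.0, 500.0, 500.0]

norm_ctrl, norm_mut = normalize_to_control(control_voxels, mutant_voxels)
t_stat, dof = pooled_t_statistic(norm_ctrl, norm_mut)
```

Because both groups are divided by the same control mean, the t statistic is unchanged by the normalization; the step only puts histogram bars on a common scale across experiments.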

Larval mass measurements. Wet weights were determined for wandering L3 male 
larvae, sexed and genotyped in PBS, dabbed dry with tissue and transferred to pre- 
weighed microfuge tubes. For each data point, triplicate samples, each containing 
an average of 50 animals per genotype were weighed (Precisa XB 120A balance). 

The histone methyltransferase SETDB1 is recurrently 
amplified in melanoma and accelerates its onset 

The most common mutation in human melanoma, BRAF(V600E), 
activates the serine/threonine kinase BRAF and causes excessive 
activity in the mitogen-activated protein kinase pathway’. 
BRAF(V600E) mutations are also present in benign melanocytic 
naevi’, highlighting the importance of additional genetic altera- 
tions in the genesis of malignant tumours. Such changes include 
recurrent copy number variations that result in the amplification 
of oncogenes**. For certain amplifications, the large number of 
genes in the interval has precluded an understanding of the coop- 
erating oncogenic events. Here we have used a zebrafish melanoma 
model to test genes in a recurrently amplified region of chromosome 1 
for the ability to cooperate with BRAF(V600E) and 
accelerate melanoma. SETDB1, an enzyme that methylates histone 
H3 on lysine 9 (H3K9), was found to accelerate melanoma forma- 
tion significantly in zebrafish. Chromatin immunoprecipitation 
coupled with massively parallel DNA sequencing and gene expres- 
sion analyses uncovered genes, including HOX genes, that are 
transcriptionally dysregulated in response to increased levels of 
SETDB1. Our studies establish SETDB1 as an oncogene in mela- 
noma and underscore the role of chromatin factors in regulating 
tumorigenesis. 

To identify genes that promote melanoma, we focused on genomic 
regions that are subject to copy number amplification in human tumour 
samples. In a study of 101 cell lines and short-term cultures of melanoma 
cells, chromosome 1q21 (chr1: 147.2-149.2 megabases) was identified as 
a recurrently amplified interval® (Fig. 1a). The same region was impli- 
cated in another comprehensive analysis of copy number variation in 
melanoma’. To test candidate genes from this interval for the ability to 
accelerate melanoma, we developed an assay in transgenic (Tg) zebrafish 
in which BRAF(V600E) is expressed under the control of a melanocyte- 
specific gene (mitfa) promoter on a p53 (also known as tp53) mutant 
background (p53−/−) (Supplementary Fig. 1). Melanomas and melanocytes 
that develop in Tg(mitfa:BRAF(V600E)); p53−/− zebrafish’ are 
suppressed by a mitfa−/− mutation. We engineered a transposon-based 
vector called miniCoopR that rescues melanocytes and melanomas 
in a Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− strain and drives 
the expression of a candidate gene in these rescued tissues. We iden- 
tified genes that were present in the human 1q21 region and were 
overexpressed as messenger RNAs in 1q21-amplified melanomas 
based on Affymetrix microarrays. Candidate human genes were cloned 
into the miniCoopR vector and injected into one-cell stage 
Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− zebrafish embryos. Tumour 
Figure 1 | SETDB1 accelerates melanoma formation in zebrafish. a, Left, 
significance of copy number amplification along chromosome 1 in human 
melanoma samples, as assessed by using the algorithm GISTIC (genomic 
identification of significant targets in cancer)*”°. Significance values were 
determined by the false discovery rate (FDR) test. Right, copy number profiles 
in the human 1q21 interval in melanoma samples (vertical bars). The positions 
of SETDB1 (dashed line) and MCL1 (arrowhead) are indicated. Mb, megabase. 
b, The Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− strain (top) was injected 
with miniCoopR-cloned candidate oncogenes. In animals injected with 
miniCoopR-SETDB1 (bottom), the melanocytes are rescued, and melanomas 
(arrow) rapidly develop. c, Melanoma-free survival curves for 
Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− zebrafish injected with 
miniCoopR-SETDB1 (weighted average of 2 independent experiments, n = 70) 
or miniCoopR-EGFP (weighted average of 3 independent experiments, 
n = 125). 
incidence curves for the resultant adults showed that one gene in 
this interval, SETDB1, significantly accelerated melanoma onset 
(P=9.4X 10’, logrank chi-squared test; Fig. 1b, c and Supplemen- 
tary Fig. 2). 
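
For readers unfamiliar with the logrank chi-squared test used to compare melanoma-free survival curves, a minimal two-group sketch follows. It is an illustration under stated assumptions, not the authors' code: the function names and the toy onset times are hypothetical, and the P value uses the identity that a chi-squared variable with one degree of freedom is a squared standard normal.

```python
import math


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Logrank chi-squared statistic (1 d.f.) for two groups.

    times_*  : follow-up time for each animal
    events_* : True if a tumour was observed, False if the animal was censored
    """
    data = ([(t, e, 0) for t, e in zip(times_a, events_a)]
            + [(t, e, 1) for t, e in zip(times_b, events_b)])
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for u, _, g in data if u >= t and g == 0)
        at_risk_b = sum(1 for u, _, g in data if u >= t and g == 1)
        n = at_risk_a + at_risk_b
        d = sum(1 for u, e, _ in data if u == t and e)
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return (observed_a - expected_a) ** 2 / variance


def chi2_pvalue_1df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 d.f."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical onset times in weeks (not data from the paper).
early = [1, 2, 3, 4]
late = [10, 11, 12, 13]
chi2 = logrank_chi2(early, [True] * 4, late, [True] * 4)
p_value = chi2_pvalue_1df(chi2)
```

At each event time the test compares the tumours observed in one group with the number expected if both groups shared the same hazard, so animals that remain tumour-free still contribute to the at-risk counts until they are censored.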

As in melanoma, SETDB1 is focally amplified in non-small-cell lung 
cancer, small-cell lung cancer, ovarian cancer, hepatocellular carcinoma 
and breast cancer (Supplementary Fig. 3). The anti-apoptotic gene 
MCL1 resides near SETDB1 in the 1q21 interval, and knockdown of 
MCL1 has been shown to diminish non-small-cell lung cancer proliferation 
and xenograft outgrowth*. However, MCL1 is not overexpressed 
in the 1q21-amplified melanoma samples, so it was not 
tested in this study. No other gene accelerated the onset of melanomas, 
suggesting that SETDB1 is a crucial gene that is amplified in the chro- 
mosome 1q21 interval. Using fluorescence in situ hybridization, we 
observed SETDB1 amplification in short-term cultures of human 
melanoma cells (Supplementary Fig. 4), directly confirming the 
array-based copy number data from which our study originated. 

Melanomas overexpressing SETDB1 were more aggressive than 
tumours overexpressing enhanced green fluorescent protein (EGFP) when 
analysed at an equivalent stage and in the same Tg(mitfa:BRAF(V600E)); 
p53−/−; mitfa−/− genetic background. The melanomas expressing 
SETDB1 were more locally invasive than the EGFP control melanomas 
(Fig. 2a; 94% (overexpressing SETDB1; n = 18) versus 53% (expressing 
EGFP; n= 17) of melanomas invaded the muscle (P = 1.6 X 10°, 
Fisher’s exact test), and 89% (SETDB1) versus 35% (EGFP) invaded 
the spinal column (P = 7.2 X 10 °, Fisher’s exact test)). MiniCoopR- 
SETDB1 transgenic melanomas had more extensive nuclear pleo- 
morphism and larger nuclei than control tumours (Supplementary 
Fig. 5). By contrast, miniCoopR-SETDB1 tumours showed similar 
levels of BRAF protein to control tumours, indicating that SETDB1 
did not accelerate melanoma formation by altering expression of the 
BRAF(V600E) transgene (Supplementary Fig. 6). 
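
The invasion comparison above rests on Fisher's exact test for a 2×2 table. A minimal sketch of the two-sided test is given below. The counts are an assumption: 17 of 18 and 9 of 17 are back-calculated from the reported 94% and 53% for muscle invasion, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


# Assumed counts reconstructed from rounded percentages:
# SETDB1: 17 invaded muscle, 1 did not; EGFP: 9 invaded, 8 did not.
p_muscle = fisher_exact_two_sided(17, 1, 9, 8)
```

Because the counts are inferred from rounded percentages, the resulting value need not reproduce the P value reported in the text.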

Melanocytes overexpressing SETDB1 grew in confluent patches in zeb- 
rafish, unlike melanocytes in the EGFP-overexpressing control zebrafish, 
which grew in a wild-type stripe pattern. We analysed the genetic interactions 
that are responsible for these pigmentation differences. SETDB1-expressing 
melanocytes in the Tg(mitfa:BRAF(V600E)); mitfa−/− 
strain formed confluent patches, but SETDB1-expressing melanocytes 
in the p53−/−; mitfa−/− strain grew in a striped pattern (Fig. 2b). 
Although SETDB1 and BRAF(V600E) cooperated to override normal 
pigment patterning, no tumours arose in miniCoopR-SETDB1-injected 
Tg(mitfa:BRAF(V600E)); mitfa−/− zebrafish, indicating that 
SETDB1 overexpression does not have the same effect as loss of p53 in 
tumour formation. 

BRAF(V600E) induces senescence in human naevi and in cultured 
mammalian melanocytes’, and we suspected that the pigmenta- 
tion differences might result from a failure of senescence and excess 
melanocyte proliferation caused by SETDB1. Using senescence-associated 
β-galactosidase (SA-β-Gal) staining’*"’, we confirmed that 
BRAF(V600E) induces senescence in zebrafish melanocytes in vivo 
(Supplementary Fig. 7a—c). We stained miniCoopR-rescued melano- 
cytes and found SETDB1-expressing melanocytes to be less senescent 
than those expressing only EGFP (Fig. 2c). SETDB1-expressing mela- 
nocytes also lacked the flattened morphology of senescent cells (Sup- 
plementary Fig. 7d). These results suggest that SETDB1 overexpression 
may contribute to melanoma formation by abrogating oncogene- 
induced senescence. 

To understand the gene expression changes that occur when SETDB1 
is overexpressed, we performed microarray analyses of zebrafish mela- 
nomas. We defined a gene signature comprising 67 human orthologues 
of genes that are downregulated in SETDB1-overexpressing zebrafish 
melanomas (Fig. 3a) and tested the relationship between this signa- 
ture and SETDB1 expression in human melanomas. Using gene set 
enrichment analysis (GSEA)’*"’, we found that the gene signature was 
inversely correlated with SETDB1 expression across a panel of 93 cell 
lines and short-term cultures of melanoma cells (Fig. 3b). SETDB1 
Figure 2 | Effects of SETDB1 on melanoma cells and melanocytes. 
a, Transverse sections of zebrafish melanomas at 2 weeks post onset of 
melanoma, visualized by staining with haematoxylin and eosin. At this time 
point, dorsal miniCoopR-EGFP melanomas (left) have exophytic growth, 
whereas miniCoopR-SETDB1 melanomas (right) have invaded from the skin, 
through the collagen-rich stratum compactum (sc) of the dermis, into the 
underlying musculature. Scale bar, 70 µm. b, SETDB1 interacts with 
BRAF(V600E), affecting the pigmentation pattern, but the p53−/− mutation is 
required for melanoma formation. miniCoopR-EGFP or miniCoopR-SETDB1 
was injected into the indicated transgenic strains. The photographs indicate 
pigmentation pattern differences before the time point at which melanomas 
begin to form in the Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− background. 
Percentages indicate the melanoma incidence at 12 weeks of age; n = number of 
fish. c, SETDB1 abrogates BRAF(V600E)-induced senescence. Left, brightfield 
pseudocoloured photomicrographs of SA-β-Gal staining performed on scale-associated 
melanocytes. Centre and right, fluorescent photomicrographs of the 
same melanocytes. In this experiment, miniCoopR-rescued melanocytes 
express mitfa-promoter-driven EGFP (centre) and MITFA (right). Multiple 
nuclei (arrowheads) are present in BRAF(V600E)-expressing melanocytes. The 
percentage of senescent melanocytes is indicated (P = 7.3 X 10 *', chi-squared 
test); n = number of cells. Scale bar, 10 µm. 


overexpression led to a broad pattern of transcriptional changes, 
including conserved downregulation of a group of genes that is 
enriched for HOX genes and for transcriptional regulators. 

To identify the direct targets of SETDB1 across the genome in mel- 
anoma cells, we performed chromatin immunoprecipitation followed 
by massively parallel sequencing (ChIP-seq). We identified SETDB1 
targets from WM262, a short-term culture of melanoma cells with 
high levels of SETDB1 expression, and from WM451Lu, a short-term 
Figure 3 | SETDB1 target gene regulation and histone methyltransferase 
complex formation. a, Heat-map of genes that are downregulated in zebrafish 
melanomas overexpressing SETDB1 compared with control (EGFP- 
expressing) melanomas. b, Graphical representation of the rank-ordered 
gene list derived from a panel of short-term cultures of human melanoma cells 
and stratified on the basis of SETDB1 expression level. The enrichment score is 
calculated based on a running-sum statistic, which increases when a gene 
(vertical line) in the gene set is encountered and decreases when one is not. 
GSEA shows that human orthologues of SETDB1-downregulated zebrafish 
genes are similarly downregulated in human melanomas as the level of SETDB1 
increases (enrichment score (ES) = −0.37, normalized enrichment score 
(NES) = −1.47, q value = 0.034 (FDR test), P = 0.034). The arrows indicate 
the positions of HOX genes. c, SETDB1 and H3K9me3 ChIP-seq profiles at the 
HOXA locus in human melanoma cells. The number of sequence reads is 
shown on the y axis. SETDB1 and overlapping H3K9me3 are present at the 
HOXA locus in WM262 cells but largely absent in WM451Lu cells. kb, 


culture of melanoma cells with low levels of SETDB1 (Supplementary 
Fig. 8). These short-term cultures harbour the BRAF(V600E) mutation 
(Supplementary Fig. 9), and their proliferation is sensitive to changes 
in SETDB1 levels (Supplementary Figs 10 and 11). In murine embry- 
onic stem cells, SETDB1 binds to the promoters of genes encoding 
developmental regulators, including Hox genes'*. We observed differ- 
ential binding of SETDB1 to genes in the HOXA cluster in melanoma 
cell lines with high and low levels of SETDB1 expression; SETDB1 is 
bound to HOXA genes in WM262 cells, whereas there is minimal 
binding in WM451Lu cells (Fig. 3c and Supplementary Tables 1 and 
2). SETDB1 catalyses the trimethylation of histone H3 lysine 9 
(H3K9me3), thereby promoting repression of its target genes. ChIP- 
seq for the H3K9me3 mark showed that H3K9me3 is present at the 
HOXA locus in WM262 cells but absent in WM451Lu cells (Fig. 3c). 
HOX gene expression is inversely correlated with SETDB1 levels in 
short-term cultures of melanoma cells (Fig. 3b), suggesting that 
enhanced target gene binding and repression may have a role in the 
SETDB1-mediated acceleration of melanoma onset. Additional ChIP-seq 
for MCAF1 (also known as AM and ATF7IP; a methyltransferase-stimulatory 
cofactor of SETDB1)"° in WM262 cells suggests that the 


kilobases. d, Melanoma-free survival curves for Tg(mitfa:BRAF(V600E)); 
p53−/−; mitfa−/− zebrafish expressing SUV39H1 (P = 6.74 X 10° versus 
miniCoopR-EGFP, logrank chi-squared test) or expressing the 
methyltransferase-deficient SETDB1 variants SETDB1(H1224K) (P = 0.24 
versus miniCoopR-SETDB1, and P = 8.4 X 10° versus miniCoopR-EGFP) or 
SETDB1(C1226A) (P = 0.20 versus miniCoopR-SETDB1, and P = 1.3 X 10" 
versus miniCoopR-EGFP). e, In vitro reconstitution of methyltransferase 
complexes containing SETDB1 and SUV39H1 variants. The sequential 
purification of glutathione S-transferase (GST)-tagged SUV39H1, Flag-tagged 
GLP and haemagglutinin (HA)-tagged G9A proteins was followed by western 
blotting using antibodies specific for the proteins and protein tags indicated on 
the left. Sequential purifications indicate that mutant SETDB1 proteins co- 
purify in a methyltransferase complex, as does wild-type (WT) SETDB1. 

f, Histone methylation assays of complexes purified from C2C12 myoblast cells. 
Complexes containing WT or mutant SETDB1 can catalyse the transfer of 
radiolabelled methyl groups to histone H3, as detected by fluorography. 


effects of SETDB1 overexpression are mediated in part by MCAF1 
(Supplementary Fig. 12). 

We assayed the effects of SETDB1 overexpression on target genes 
by infecting WM451Lu cells with a SETDB1-expressing lentivirus. 
Using SETDB1 ChIP-seq data from WM451Lu cells, we found that 
SETDB1-bound targets are significantly enriched in downregulated 
genes but not upregulated ones (Supplementary Fig. 13 and Sup- 
plementary Table 3), suggesting that a major consequence of SETDB1 
amplification is repression of SETDB1-bound target genes. However, 
many SETDB1 target genes in both WM451Lu and WM262 short-term 
cultures of melanoma cells are not methylated, and additional analyses 
show a relationship between increasing SETDB1 levels and increasing 
expression of many SETDB1 target genes (Supplementary Fig. 14). 

To obtain a mechanistic insight into the role of SETDB1 in regulating 
gene expression, we undertook genetic and biochemical studies that 
evaluate methyltransferase activity. Recently, a complex containing 
SETDB1 and the H3K9 methyltransferases SUV39H1, G9A (also 
known as EHMT2) and GLP (also known as EHMT1) was discovered"®. 
To examine the possibility that other methyltransferases act together 
with SETDB1 to modulate melanoma onset, we tested whether 
SUV39H1 could accelerate melanoma formation in zebrafish. As was 
the case for SETDB1, overexpression of SUV39H1 led to the formation 
of confluent melanocyte patches, and it accelerated melanoma onset 
(Fig. 3d). We next examined the consequences of mutations that render 
SETDB1 enzymatically inactive. Enzymatically deficient SETDB1 was 
capable of incorporating into the methyltransferase complex in vitro 
(Fig. 3e) and in vivo (Supplementary Fig. 15). Furthermore, in the 
context of enzymatically deficient SETDB1, the complex retained 
methyltransferase activity (Fig. 3f and Supplementary Fig. 16) and 
binding site localization (Supplementary Fig. 17). Last, the melanoma 
incidence curves for two methyltransferase-deficient SETDB1 mutants 
were similar to each other and to the melanoma incidence curve for 
zebrafish that overexpress wild-type SETDB1 (Fig. 3d). Our studies sug- 
gest a model in which activity of the methyltransferase complex contain- 
ing SETDB1 and SUV39H1 alters gene expression in a way that leads to 
the acceleration of melanoma onset and to increased invasiveness. 

To determine the extent of SETDB1 overexpression in human 
melanomas, and to examine potential clinical implications, we per- 
formed immunohistochemistry on melanoma tissue microarrays. 
After confirming antibody specificity (Supplementary Fig. 18), we 
observed high levels of SETDB1 expression in 5% of normal melano- 
cytes (n = 20), 15% of benign naevi (n = 20) and 70% of malignant 
melanomas (n = 91) (Fig. 4). On the basis of our observations of pre- 
malignant melanocytic lesions in zebrafish, we speculate that human 
naevi that overexpress SETDB1 are more likely to undergo oncogenic 
progression than naevi with basal levels of SETDB1 expression. These 
data indicate that the majority of malignant human melanomas over- 
express the SETDB1 protein. 

In this study, we adapted the zebrafish as a platform for cancer 
gene discovery. Through the creation and analysis of more than 3,000 


[Figure 4 image: SETDB1 staining panels for melanoma, naevus and skin; staining-level key: low, intermediate, high] 


Figure 4 | High level expression of SETDB1 protein is common in human 
melanomas but not naevi or normal melanocytes. Immunohistochemical 
staining of SETDB1 (left, purple) and haematoxylin and eosin (H+E) staining 
(centre). Arrowheads indicate melanocytes in normal skin samples. SETDB1 
expression (right, measured as described in the Methods) was scored for 
malignant melanomas (top, n = 91), naevi (centre, n = 20) and normal skin 
(bottom, n = 20). The percentage of samples with a low, intermediate or high 
level of SETDB1 staining is indicated. Summarized data and raw data from 
experiments with two different antibodies are described in Supplementary 
Tables 4 and 5, respectively. Scale bar, 30 µm; insets are magnified ×2.5. 
transgenic animals, SETDB1 was identified as a gene capable of 
accelerating melanoma formation in cooperation with BRAF(V600E). 
Amplification of the 1q21 chromosomal interval in melanoma does not 
preferentially occur together with the BRAF(V600E) mutation 
(P= 0.28, two-sample t-test). Therefore, it is probable that the 
tumour-promoting activity of SETDB1 does not exclusively depend 
on BRAF(V600E), which is common in melanomas but is found less 
frequently in other tumour types that have 1q21 amplification. SETDB1 
forms a multimeric complex with SUV39H1 and other H3K9 methyl- 
transferases. On the basis of our findings, we speculate that SETDB1 
overexpression can increase the activity of the H3K9 methyltransferase 
complex, leading to alterations in its target specificity. Inactivating 
mutations in histone methyltransferases and histone demethylases were 
recently described in renal cell carcinoma'”’*. Our study lends func- 
tional support to the idea that perturbation of histone methylation 
promotes cancer. Moreover, SETDB1 is focally amplified in a broad 
range of malignancies, suggesting that alterations in histone methyl- 
transferase activity could define a biologically related subset of cancers. 


METHODS SUMMARY 


miniCoopR assay. The miniCoopR vector was constructed by inserting a zebrafish 
mitfa minigene (consisting of promoter, open reading frame and 3'-untranslated 
region) into the BglII restriction site of the plasmid pDestTol2pA2 (ref. 19). 
Individual miniCoopR clones were created by MultiSite Gateway recombination 
(Invitrogen) using human full-length open reading frames. Twenty-five picograms 
of each miniCoopR-candidate clone and 25 pg mRNA encoding the Tol2 transposase 
were microinjected into one-cell zebrafish embryos generated from an incross of 
Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− zebrafish. Rescued animals were scored 
weekly for the presence of visible tumours. 

Senescence assay. SA-β-Gal staining was performed as described previously’, 
except that scales plucked from the dorsum of melanocyte-rescued zebrafish were 
stained instead of tissue sections. This assay was performed on an albino(b4) 
mutant background so that melanin pigment would not obscure SA-β-Gal staining. 
Experimental animals were injected with 20 pg miniCoopR-SETDB1 plus 
10 pg miniCoopR-EGFP, and control animals were injected with 30 pg 
miniCoopR-EGFP. Rescued melanocytes were recognized as EGFP-positive cells. 

Gene expression. From zebrafish, total RNA was extracted from four 
miniCoopR-SETDB1 melanomas and four miniCoopR-EGFP melanomas. Total 
RNA from each was used for the synthesis of cDNA, which was hybridized to a 
385K microarray (NimbleGen 071105_Zv7_EXPR). Zebrafish genes that were 
downregulated by SETDB1 were selected by a fold change of >5 (when comparing 
the level of expression in miniCoopR-EGFP melanomas and the level in 
miniCoopR-SETDB1 melanomas) and then filtered by a ‘SETDB1 specificity 
score’, which was defined as a fold change of >3 when comparing the level of 
expression in Tg(mitfa:BRAF(V600E)); p53−/− melanomas with that of 
miniCoopR-SETDB1 melanomas. 
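
The two-step gene selection described above can be sketched as a simple filter. This is an illustration under assumptions, not the authors' script: expression values are taken to be group means on a linear scale, fold change is taken as control level divided by SETDB1 level, and the gene names, dictionary keys and function name are hypothetical.

```python
def select_downregulated(expression, min_fold_vs_egfp=5.0,
                         min_fold_vs_background=3.0):
    """Return genes passing both fold-change thresholds, sorted by name.

    expression maps gene -> {"egfp": ..., "background": ..., "setdb1": ...},
    where "background" stands for Tg(mitfa:BRAF(V600E)); p53-/- melanomas.
    """
    selected = []
    for gene, level in expression.items():
        setdb1 = level["setdb1"]
        if setdb1 <= 0:
            continue  # a ratio is undefined without a positive denominator
        fold_vs_egfp = level["egfp"] / setdb1
        specificity_score = level["background"] / setdb1
        if (fold_vs_egfp > min_fold_vs_egfp
                and specificity_score > min_fold_vs_background):
            selected.append(gene)
    return sorted(selected)


# Hypothetical mean expression values (arbitrary units).
toy_expression = {
    "gene_a": {"egfp": 100.0, "background": 90.0, "setdb1": 10.0},
    "gene_b": {"egfp": 100.0, "background": 20.0, "setdb1": 10.0},
    "gene_c": {"egfp": 30.0, "background": 90.0, "setdb1": 10.0},
}
signature = select_downregulated(toy_expression)
```

Only gene_a passes both thresholds here; gene_b fails the specificity score and gene_c fails the fivefold cut-off against the EGFP control.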

Immunohistochemistry. Human melanoma tissue microarrays were indepen- 
dently analysed for SETDB1 protein by immunohistochemistry, using a rabbit 
polyclonal antibody (Sigma HPA018142, at a 1/200 dilution) and a mouse mono- 
clonal antibody (4A3, Sigma WH0009869M7, 1/400 dilution), with a purple sub- 
strate for the secondary antibody (VIP substrate, Vector Labs). A methyl green 
counterstain was used. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


miniCoopR assay. The miniCoopR vector was constructed by inserting a zebrafish 
mitfa minigene (consisting of promoter, open reading frame and 3’-untranslated 
region) into the BglII site of the plasmid pDestTol2pA2 (ref. 19). Individual 
miniCoopR clones were created by MultiSite Gateway recombination (Invitrogen) 
using human full-length open reading frames. Recombination junctions were 
sequence verified. Twenty-five picograms of each miniCoopR-candidate clone and 
25 pg mRNA encoding the Tol2 transposase were microinjected into one-cell zebrafish 
embryos generated from an incross of Tg(mitfa:BRAF(V600E)); p53−/−; mitfa−/− 
zebrafish. Transgenic animals were selected based on the presence of rescued mela- 
nocytes at 48 h post fertilization. Rescued animals were scored weekly for the presence 
of visible tumours. 

Tumour invasion assay. Zebrafish with dorsal melanomas between the head and 
dorsal fin were isolated, and tumours were allowed to progress for 2 weeks, at 
which time the animals were killed. Tumours were formalin fixed, embedded and 
sectioned transversely to assess invasion. 

Senescence assay. SA-β-Gal staining was performed as described previously"’, 
except that scales plucked from the dorsum of melanocyte-rescued zebrafish were 
stained instead of tissue sections. This assay was performed on an albino(b4) 
mutant background so that melanin pigment would not obscure SA-β-Gal staining. 
Experimental animals were injected with 20 pg miniCoopR-SETDB1 plus 
10 pg miniCoopR-EGFP, and control animals were injected with 30 pg 
miniCoopR-EGFP. Rescued melanocytes were recognized as EGFP-positive cells. 

Gene expression and GSEA. From zebrafish, total RNA was extracted from four 
miniCoopR-SETDB1 melanomas and four miniCoopR-EGFP melanomas. Total 
RNA from each was used for the synthesis of cDNA, which was hybridized to a 
385K microarray (NimbleGen 071105_Zv7_EXPR). Zebrafish genes that were 
downregulated by SETDB1 were selected by a fold change of >5 (when comparing 
the level of expression in miniCoopR-EGFP melanomas and the level in 
miniCoopR-SETDB1 melanomas) and then filtered by a ‘SETDB1 specificity score’, 
which was defined as a fold change of >3 when comparing the level of expression in 
Tg(mitfa:BRAF(V600E)); p53−/− melanomas with that of miniCoopR-SETDB1
melanomas. Human orthologues of SETDB1-downregulated genes were identified
for GSEA (http://www.broadinstitute.org/gsea/). For GSEA of SETDB1-downregulated
and SETDB1 ‘bound-bound’ genes, a rank-ordered gene list was derived
from the expression profiles of 93 melanoma cell lines and short-term cultures of
melanoma cells, using SETDB1 expression level as a continuous variable. In
WM451Lu cells, the dose of SETDB1 lentiviral infection was titrated to achieve 
SETDB1 expression levels comparable to those in short-term cultures of melanoma 
cells with high levels of SETDB1 expression. Total RNA was extracted and then 
amplified and hybridized to a Human Gene 1.0 ST Array (Affymetrix). Control 
gene expression values were obtained from WM451Lu cells infected with EGFP- 
expressing lentivirus. 
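
As a rough illustration of the two-step filter described above for zebrafish genes (a fold change of >5, followed by a specificity score of >3), the following Python sketch applies both thresholds to invented expression values. The gene names, the numbers and the function name are hypothetical, and fold change is taken here as a simple ratio of mean expression levels, which is an assumption; the text does not state how the arrays were normalized.

```python
# Hypothetical sketch of the two-threshold filter; not the authors' pipeline.
# Values are invented mean expression levels (linear scale) per tumour type.

def setdb1_downregulated(genes, fc_cutoff=5.0, specificity_cutoff=3.0):
    """Return names of genes that pass both fold-change filters.

    genes maps a gene name to mean expression in:
      'egfp'   -> miniCoopR-EGFP melanomas
      'braf'   -> Tg(mitfa:BRAF(V600E)); p53-/- melanomas
      'setdb1' -> miniCoopR-SETDB1 melanomas
    """
    selected = []
    for name, expr in genes.items():
        fold_change = expr["egfp"] / expr["setdb1"]   # EGFP versus SETDB1
        specificity = expr["braf"] / expr["setdb1"]   # BRAF; p53 versus SETDB1
        if fold_change > fc_cutoff and specificity > specificity_cutoff:
            selected.append(name)
    return selected


example = {
    "geneA": {"egfp": 1200.0, "braf": 900.0, "setdb1": 100.0},  # passes both
    "geneB": {"egfp": 800.0, "braf": 250.0, "setdb1": 100.0},   # fails specificity
    "geneC": {"egfp": 300.0, "braf": 900.0, "setdb1": 100.0},   # fails fold change
}

print(setdb1_downregulated(example))
```

Only genes that are lower in miniCoopR-SETDB1 melanomas than in both comparison groups survive the filter, which is the point of the second, specificity threshold.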

ChIP. ChIP was performed from short-term cultures of WM262 and WM451Lu
cells, and ChIP-seq data were analysed as described previously.

Methyltransferase complex reconstitution. In vitro-translated Flag-GLP, HA-G9A
and untagged SETDB1 (wild type (WT) or the indicated mutant) were
incubated for 4 h at 4 °C with 5 mg GST, GST-SUV39H1(WT) or
GST-SUV39H1(H324K) mutant immobilized on agarose-glutathione beads. Beads
were then extensively washed, as described previously, and protein complexes
were eluted with free glutathione. The eluate was then subjected to an overnight 
Flag immunoprecipitation at 4°C using Flag—agarose. After extensive washing, 
protein complexes were eluted with 0.1 M glycine, pH 3.0. The glycine was then 
neutralized with NaOH, and the eluate was renatured for 1 h at room temperature 
then incubated overnight at 4 °C with HA-resin. The HA-resin was then washed, 
and the protein complexes were eluted with SDS. Ten per cent of the input and 
100% of the HA eluate were resolved by SDS-PAGE and were analysed by western 
blotting with the indicated antibodies. The top of the membrane was revealed with 
three different antibodies (anti-SETDB1, anti-HA and anti-Flag antibody) using 
two stripping steps. 

Histone methylation assay. Purified complexes were incubated with 5 mg core
histones (Upstate 13-107) and 1.5 mCi S-adenosyl-L-[methyl-³H]methionine
(PerkinElmer NET155050UC) in a buffer containing 50 mM Tris, pH 8.0,
100 mM NaCl, 1% NP40, 1 mM dithiothreitol and protease inhibitors (with a
reaction volume of 30 μl). The mixture was incubated for 1 h at 30 °C and was
then separated by SDS-PAGE. The gel was stained with a SimplyBlue SafeStain kit
(Invitrogen) and analysed by fluorography using an FLA-7000 phosphorimager
(Fuji). 

Immunohistochemistry. Human melanoma tissue microarrays were indepen- 
dently analysed for SETDB1 protein by immunohistochemistry, using a rabbit 
polyclonal antibody (Sigma HPA018142, at a 1/200 dilution) and a mouse mono- 
clonal antibody (4A3, Sigma WH0009869M7, 1/400 dilution), with a purple sub- 
strate for the secondary antibody (VIP substrate, Vector Labs). A methyl green 
counterstain was used. Melanoma tissue microarrays were obtained from US 
Biomax (ME1003 and ME482). A modified visual semiquantitative method was 
used to score staining as described previously, using a two-score system for
immunointensity (II) and immunopositivity (IP). II and IP were then multiplied.
SETDB1 immunostaining was also performed on formalin-fixed, paraffin-embedded 
zebrafish melanomas. 
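
The scoring step above amounts to a product of two sub-scores per tissue core. The sketch below shows that arithmetic only; the core identifiers and the sub-score values are hypothetical, and the ranges of II and IP are assumptions, because the scoring scales are given in the cited method rather than in this text.

```python
# Hypothetical sketch: combined staining score = II x IP for each core.
# Scales for II and IP are assumed, not taken from the paper.

def staining_score(immunointensity, immunopositivity):
    """Multiply the intensity score (II) by the positivity score (IP)."""
    return immunointensity * immunopositivity


cores = {
    "core_01": (3, 4),  # strong staining, most cells positive
    "core_02": (1, 2),  # weak staining, some cells positive
    "core_03": (0, 0),  # no staining
}

scores = {core: staining_score(ii, ip) for core, (ii, ip) in cores.items()}
print(scores)
```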

Fluorescence in situ hybridization (FISH). The bacterial artificial chromosome 
(BAC) clone, RP11-42A12, used as the SETDB1 probe was selected using the
UCSC Genome Browser and obtained from the BACPAC Resource Center
(CHORI). BAC probe preparation, labelling and hybridization were performed
as described previously. The SpectrumOrange-CEP1 reference probe 06J39-026
was obtained from Abbott Molecular. 

Lentivirus infection. We used pLKO1-puromycin lentiviral vectors carrying 
short hairpin RNAs (shRNAs) specific for SETDB1 or GFP sequences. The 
shRNA vectors were obtained from the Broad Institute RNAi Consortium 
(http://www.broadinstitute.org/rnai/trc), and the lentiviruses were obtained by
overnight triple co-transfection of 293T cells using lipofectamine 2000 and 3 μg
pLKO.1-shRNA, Δ8.9 and VSV-G vectors (in 100-mm plates). The SETDB1
shRNA construct used was TRCN0000148112 (hairpin target sequence
gctcagatgataacttctgta). Knockdown efficiency was determined by RT-PCR and by
immunoblot analysis using SETDB1-specific primers and antibody, respectively. At days
2 and 3 post infection, supernatants were collected and filtered with a 45-μm filter
to remove 293T cells. Virus was added to cells (plated to attain 30–40% confluence
at the time of infection) and incubated for 16 h in the presence of polybrene at
5 μg ml⁻¹. After infection, virus was removed and fresh media added. Forty-eight
hours post infection, cells were subjected to a 3-day puromycin selection. To 
elevate SETDB1 expression in WM451Lu cells, we used a pLEX980 lentiviral 
vector into which the wild-type SETDB1 open reading frame had been recom- 
bined. Infection, selection and monitoring of SETDB1 concentration were per- 
formed as described above. 

Cell proliferation assays. Cells were plated in 12-well plates at 20,000 cells well⁻¹
in 2 ml medium. At each time point, cells were trypsinized from 3 wells and
counted with a cell counter (Beckman Coulter). 

Western blot analyses. Western blots were performed with primary antibodies 
recognizing SETDB1 (Abcam ab12317), BRAF (Santa Cruz Biotechnology 
sc5284), α-tubulin (Cell Signaling Technology 2144), SUV39H1 (Upstate 07-550),
the Flag epitope (Sigma M2), G9A (Clinisciences D141-3), and GLP
(R&D Systems PP-B0422-00). HRP-conjugated anti-rabbit and anti-mouse secondary
antibodies (Amersham) were used. Thirty micrograms total protein was
loaded per lane. SETDB1 was recognized as a doublet at approximately 150 kDa 
(predicted molecular mass 143 kDa). 

p53BP2 quantitative PCR analysis. Chromatin extracts from HeLa cells were 
prepared as described, using micrococcal nuclease digestion and mild sonication
(without any formaldehyde crosslinking) to enrich them in mono-nucleosomes. 
Flag-HA purification was then performed. A fraction of the input or 1/3 of each 
Flag-HA purified complex (chromatin associated) was treated with proteinase K 
and RNase. Samples were then phenol-chloroform extracted, and DNA was precipitated
using ethanol. p53BP2 primers were described previously. Results were
normalized to input values and presented as fold enrichment compared with the 
control sample (Flag-HA purification from HeLa control cells). 
DHODH modulates transcriptional elongation in the 
neural crest and melanoma 
Melanoma is a tumour of transformed melanocytes, which are 
originally derived from the embryonic neural crest. It is unknown 
to what extent the programs that regulate neural crest development 
interact with mutations in the BRAF oncogene, which is the most 
commonly mutated gene in human melanoma. We have used zebrafish
embryos to identify the initiating transcriptional events that
occur on activation of human BRAF(V600E) (which encodes an 
amino acid substitution mutant of BRAF) in the neural crest lineage. 
Zebrafish embryos that are transgenic for mitfa:BRAF(V600E) and 
lack p53 (also known as tp53) have a gene signature that is enriched 
for markers of multipotent neural crest cells, and neural crest pro- 
genitors from these embryos fail to terminally differentiate. To 
determine whether these early transcriptional events are important 
for melanoma pathogenesis, we performed a chemical genetic screen 
to identify small-molecule suppressors of the neural crest lineage, 
which were then tested for their effects on melanoma. One class of 
compound, inhibitors of dihydroorotate dehydrogenase (DHODH), 
for example leflunomide, led to an almost complete abrogation of 
neural crest development in zebrafish and to a reduction in the self- 
renewal of mammalian neural crest stem cells. Leflunomide exerts 
these effects by inhibiting the transcriptional elongation of genes 
that are required for neural crest development and melanoma 
growth. When used alone or in combination with a specific inhibitor 
of the BRAF(V600E) oncogene, DHODH inhibition led to a marked 
decrease in melanoma growth both in vitro and in mouse xenograft 
studies. Taken together, these studies highlight developmental path- 
ways in neural crest cells that have a direct bearing on melanoma 
formation. 

In melanoma, it is unknown to what extent BRAF(V600E) muta- 
tions depend on transcriptional programs that are present in the 
developmental lineage of tumour initiation. These programs may be 
therapeutic targets when combined with BRAF(V600E) inhibition. 
We have used zebrafish embryos to identify small-molecule suppres- 
sors of neural crest progenitors that give rise to melanoma. Transgenic 
zebrafish expressing human BRAF(V600E) under the melanocyte- 
specific mitfa promoter, Tg(mitfa:BRAF(V600E)), develop melanoma 
at 4–12 months of age when crossed with p53−/− mutant zebrafish,
Tg(mitfa:BRAF(V600E)); p53−/− (Fig. 1a). Because the mitfa promoter
drives the expression of BRAF(V600E) from 16 h after fertilization
(a time point that overlaps with the expression of embryonic neural
crest markers such as sox10), events that occur early in embryogenesis
are analogous to those that occur at tumour initiation. To gain
insight into these initiating events, we compared the gene expression
profiles of Tg(mitfa:BRAF(V600E)); p53−/− embryos with those of
Tg(mitfa:BRAF(V600E)); p53−/− melanomas by using gene set enrichment
analysis (Fig. 1b). This approach uncovered a signature of 123
overlapping genes, which is enriched for markers of embryonic neural
crest progenitors (crestin, sox10 and ednrb (also known as ednrb1)) 
and melanocytes (tyr and dct) (see Supplementary Table 1 for full gene 
sets). The overlapping gene signature is similar to the signature of a 
multipotent neural crest progenitor, suggesting that the melanomas 
have adopted this cell fate. 

We analysed alterations in embryonic neural crest development by 
using in situ hybridization. At 24 h post fertilization,
Tg(mitfa:BRAF(V600E)); p53−/− embryos show an abnormal expansion in the number
of crestin⁺ progenitors, together with an increase in other markers
from the 123 gene signature such as spry4 and rab3il1 (Supplementary
Fig. 1). By 72 h post fertilization, crestin persists aberrantly in the head,
tail and dorsal epidermis only in Tg(mitfa:BRAF(V600E)); p53−/−
embryos but not in embryos with either single mutation (Supplementary
Fig. 2a). The gene encoding Crestin is zebrafish specific and is
normally downregulated after the terminal differentiation of neural
crest progenitors. Our finding therefore suggests that activated
BRAF(V600E) promotes the maintenance of multipotency in neural
crest progenitors, which become expanded during tumorigenesis. In
adult Tg(mitfa:BRAF(V600E)); p53−/− melanomas, almost all tumour
cells, but no normal cells, were positive for crestin (Fig. 1c). Only 10–15%
of the melanoma cells were pigmented (Supplementary Fig. 2b),
which is consistent with the concept that adult zebrafish melanomas 
retain a progenitor-like state. A human melanoma tissue array yielded 
similar results: 52 out of 70 of the melanomas on the array (74.3%) were 
positive for the neural crest progenitor gene ednrb, but only 9 of 70 
(12.9%) were positive for the melanocyte lineage marker dct 
(Supplementary Fig. 3), in agreement with the finding that most human 
melanomas express the neural crest marker sox10 (ref. 4). These data 
indicate that the majority of human melanomas reflect events that lead 
to the maintenance of a neural crest progenitor phenotype.

We proposed that chemical suppressors of neural crest progenitors 
would be useful for treating melanoma. We screened 2,000 chemicals 
to identify compounds that inhibit the crestin⁺ lineage during embryogenesis.
Most chemicals (90%) had a minimal effect or were toxic
(Supplementary Fig. 4). NSC210627, a molecule of unknown function, 
strongly abrogated the expression of crestin (Fig. 2a, centre and left). 
The chemoinformatic algorithm DiscoveryGate® revealed similarity 
between NSC210627 and brequinar (Supplementary Fig. 5), an inhibitor
of DHODH. NSC210627 inhibited DHODH activity in vitro (Supplementary
Fig. 6). Leflunomide, a DHODH inhibitor that is structurally
distinct from NSC210627 (ref. 8), phenocopied NSC210627 (Fig. 2a, 
right) and was used for further studies because of its availability. 

We examined neural crest derivatives affected by leflunomide. Treated 
zebrafish embryos were devoid of pigmented melanocytes at 36-48 h 
post fertilization (Fig. 2b) and iridophores at 72h post fertilization 
Figure 1 | Transgenic zebrafish melanoma and neural crest gene expression.
a, Transgenic zebrafish expressing BRAF(V600E) under the control of the
promoter of the melanocyte-specific gene mitfa, Tg(mitfa:BRAF(V600E)),
develop pigmentation abnormalities and melanoma when crossed with p53−/−
fish. Their gross embryonic development is largely normal. hpf, hours post
fertilization; WT, wild type. b, Gene expression analysis revealed a unique gene
signature at 72 h post fertilization in the Tg(mitfa:BRAF(V600E)); p53−/− strain
(left). Gene set enrichment analysis uncovered an enrichment between the


(Supplementary Fig. 7a). dhodh inhibition led to a loss of ventral 
melanocytes in stage 38 Xenopus embryos (Supplementary Fig. 7b). 
Leflunomide treatment led to an almost complete loss of melanocyte 
progenitors at 24h post fertilization (Fig. 2c), a reduction in the num- 
ber of glial cells at 72 h post fertilization (Fig. 2d) and disruption of jaw 
cartilage at 72h post fertilization (data not shown). Leflunomide 
reduced the expression of sox10 and dct, which are expressed by neural 
crest progenitors and melanocytes, respectively, while leaving other 
lineages such as blood and notochord less affected (Supplementary 
Fig. 8). Microarray analysis of leflunomide-treated embryos showed 
downregulation of 49% of the genes that were upregulated in the 123- 
gene melanoma signature, and more than half of these are neural crest 
related (see Supplementary Table 2 for the complete list). 

The loss of several types of neural crest derivative suggested that 
leflunomide acts on neural crest stem cells. We tested leflunomide, and 
its derivative A771726, on neural crest stem cells isolated from the fetal
(embryonic day 14.5) rat gut. Both chemicals reduced the number of
self-renewing neural crest stem cells in primary stem cell colonies, to
27 ± 5.35% (leflunomide) and 35 ± 6.16% (A771726) of control numbers
(P < 0.0003 and P < 0.00007, respectively, Student’s t-test; Fig. 2e
and Supplementary Fig. 9a). Colony size was also reduced compared
with controls (by 18% and 24%, respectively; P < 0.02, Student’s
t-test), but there was no effect on the differentiation or survival of
specific progeny (Supplementary Fig. 9b, c). These results demonstrate
that DHODH inhibitors negatively regulate the self-renewal of neural
crest stem cells and have an effect on these cells in multiple species.

DHODH catalyses the fourth step in the synthesis of pyrimidine 
nucleotides (NTPs). We noted striking morphological similarity
between leflunomide-treated embryos and spt5/spt6 mutants, suggesting
that leflunomide acts to suppress transcriptional elongation.
In the spt5(sk8) null mutant, we found a lack of both crestin expression
and pigmented melanocytes (similar to leflunomide-treated embryos) 
(Supplementary Fig. 10a). At 24h post fertilization, the gene expression 
embryonic gene signature and the adult melanomas that form 4–12 months
later (centre and right; see the Supplementary Information for full protocol
details). Embryo heat-map columns represent an average of three clutches (log₂
scale, range −2-fold to +2-fold increase); adult heat-map columns represent
individual fish (log₂ scale, range −10-fold to +10-fold increase). c, In situ
hybridization of sagittal sections of WT and Tg(mitfa:BRAF(V600E)); p53−/−
adults reveals homogeneous crestin expression (blue) only within the dorsal
melanoma; it is absent from normal adult tissues.


profiles of spt5(sk8) mutants and leflunomide-treated embryos were
nearly identical; of 223 genes downregulated after leflunomide treatment,
183 were similarly downregulated in spt5(sk8) mutants (Supplementary
Table 3 and Supplementary Fig. 10b). These downregulated
genes include neural crest genes (crestin, sox10 and mitfa) and
members of the notch pathway (her2 and dlb). We examined the interaction
of Dhodh with spt5 by incubating the hypomorphic spt5(m806)
mutant (which has only mild melanocyte defects) in low concentrations
of leflunomide (3–5 μM) and then analysing the number of pigmented
melanocytes. Enhanced sensitivity to leflunomide was shown
by spt5(m806) embryos (Fig. 3a and Supplementary Fig. 11); at 3 μM
leflunomide, 99% of mutant embryos had few or no melanocytes,
compared with 0% of wild-type embryos (P = 0.000018, Kruskal–Wallis
test; Supplementary Fig. 11b). These data confirm that
DHODH inhibition affects transcriptional elongation, which is consistent
with previous data demonstrating that a reduction in nucleotide
pools in vitro leads to defects in elongation.

We assessed whether leflunomide specifically caused defects in the 
transcriptional elongation of genes required for neural crest development
by using reverse transcription followed by quantitative PCR (Supplementary
Fig. 10c and Supplementary Table 4). Leflunomide caused
either no change or an increase in 5′ transcript abundance but a significant
downregulation of 3′ transcripts of mitfa (for 5′ transcripts
3.75 ± 1.19-fold change versus 0.39 ± 0.07-fold change for 3′ transcripts;
P < 0.05, Student’s t-test) and dlb (5′ transcripts 1.13 ± 0.14-fold
change versus 3′ transcripts 0.74 ± 0.07-fold change; P < 0.05).
Leflunomide did not have a similar effect on control genes such as the
gene encoding β-actin (5′ transcripts 1.03 ± 0.07-fold change versus 3′
transcripts 0.99 ± 0.06-fold change; P is not significant, Student’s t-test).
In the presence of leflunomide, transcription is initiated normally, but 
the transcripts accumulate and do not undergo productive elongation. 
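
One way to summarize the 5′ and 3′ measurements above is a 3′/5′ ratio per gene, where a value well below 1 indicates that transcripts are initiated but not completed. The sketch below uses the mean fold changes reported in the text (error terms omitted); the ratio itself is an illustrative summary statistic and an assumption here, not a quantity reported by the authors.

```python
# Mean fold changes after leflunomide, taken from the text: (5' end, 3' end).
# The 3'/5' ratio is an illustrative summary, not a value from the paper.
fold_changes = {
    "mitfa": (3.75, 0.39),
    "dlb": (1.13, 0.74),
    "beta-actin": (1.03, 0.99),  # control gene
}


def three_to_five_ratio(five_prime, three_prime):
    """Ratio of 3' to 5' transcript abundance (both as fold change)."""
    return three_prime / five_prime


ratios = {
    gene: three_to_five_ratio(five, three)
    for gene, (five, three) in fold_changes.items()
}

for gene, ratio in ratios.items():
    print(f"{gene}: 3'/5' ratio = {ratio:.2f}")
```

On these numbers the ratio is lowest for mitfa, intermediate for dlb and close to 1 for the β-actin control, matching the ordering described in the paragraph.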

To confirm that this mechanism is conserved in human melanoma, 
we performed chromatin immunoprecipitation using an antibody 
Figure 2 | A chemical genetic screen to identify suppressors of neural crest
development. a, A search for chemical suppressors of the crestin⁺ lineage
during embryogenesis identified NSC210627, a compound that completely 
abrogates the expression of crestin (which is normally present in the head, along 
the dorsum and in ventrally migrating neural crest cells), as shown by in situ 
hybridization (a, left and centre). DMSO is used as a control. The 
DiscoveryGate chemoinformatic algorithm revealed structural similarity 
between NSC210627 and brequinar (Supplementary Fig. 5), an inhibitor of 
DHODH. Leflunomide, a structurally distinct DHODH inhibitor, phenocopies 
the crestin phenotype induced by treatment with NSC210627 (a, right). 

b-d, Treatment with leflunomide caused an absence of multiple neural crest 
derivatives, including pigmented melanocytes (b); melanocyte progenitors, 
which were visualized by expressing green fluorescent protein (GFP) under the 
control of the mitfa promoter (c); and glial cells, which were visualized by 
expressing the fluorescent protein mCherry under the control of the myelin 
basic protein (mbp) promoter (d). e, Treatment with leflunomide, or A771726 
(Supplementary Fig. 9a), significantly reduced the number of multipotent 
daughter cells that could be subcloned from individual primary neural crest 
stem cell colonies. Values shown are mean ± s.d. of three replicates; *, P < 0.05
compared with control, Student’s t-test. 


specific for RNA polymerase II (RNA Pol II), followed by sequencing 
(ChIP-seq). Transcriptional elongation was measured using the 
travelling ratio (TR), defined as the ratio of RNA Pol II density in
the promoter-proximal region to that in the gene body.
In the human melanoma cell lines A375 and Malme-3M, leflunomide 
treatment caused a significant inhibition of transcriptional elongation 
(measured as an increase in the TR), particularly for genes with a TR 
that was initially low (<7.5). For example, in A375 cells, the TR 
increased by >1.3 fold at 21.3% of loci; in Malme-3M cells, this 
occurred for 36.3% of loci (Supplementary Table 5). Examination of 
RNA Pol II occupancy using metagene analysis at a variety of fold- 
change cutoffs (Fig. 3b (A375), Supplementary Fig. 12 (Malme-3M) 
and Supplementary Table 5) revealed no defect in transcription initiation
but a decrease in elongation that was pronounced at the 3′ end
of genes such as NPM1 and CCND1 (Fig. 3c). Ingenuity Pathway
Analysis on the loci affected in both cell lines revealed a strong enrichment
for Myc targets and pathway members (Supplementary Fig.
Figure 3 | DHODH inhibition modulates transcriptional elongation. a, The
hypomorphic spt5(m806) homozygous mutant (top right) has only a mild
pigmentation defect compared with WT (top left) or heterozygous (top centre)
animals. Treatment with a low dose of leflunomide (3 μM) leads to an almost
complete absence of neural-crest-derived melanocytes in the mutant line. See
Supplementary Fig. 11 for dose-response quantification of this effect.

b, Metagene analysis of RNA Pol II occupancy in A375 human melanoma cells
after treatment with leflunomide. The metagene plot allows visualization of all
of the genes that are occupied by RNA Pol II, corrected for individual gene
lengths. Genome-wide RNA Pol II occupancy at the promoter region is 
unaffected but is diminished at the 3’ end of the genes. The inset shows a higher 
magnification of the 3’ region of the genes. c, Representative examples of Myc 
target genes (NPM1 and CCND1), which demonstrate defects in transcriptional 
elongation after treatment with leflunomide. A gene that is minimally affected 
(DGAT) is also depicted. For NPM1, the TR is 5.04 after DMSO treatment and 
8.10 after leflunomide treatment. For CCND1, the TR is 3.47 after DMSO 
treatment and 4.67 after leflunomide treatment. For DGAT, the TR is 5.19 after 
DMSO treatment and 5.34 after leflunomide treatment. 




13a, b). Myc, in addition to its requirement for neural crest development,
was recently shown to be a potent regulator of transcriptional
pause release in embryonic stem cells. Our data suggest that the
mechanism by which Myc target genes are regulated at the transcrip- 
tional elongation level operates in neural-crest-derived melanoma as 
well. Taken together, the genetic and biochemical data demonstrate 
that leflunomide acts to modulate transcriptional elongation in both 
neural crest development and human melanoma. 
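
The travelling ratio used above can be illustrated with a short calculation. The sketch below defines TR as promoter-proximal RNA Pol II density divided by gene-body density and then computes the fold change in TR for the three genes whose values are given in the Fig. 3 legend. Applying the >1.3-fold cutoff to these individual genes is an assumption made for illustration, and the function names are hypothetical.

```python
# Sketch of the travelling ratio (TR) arithmetic; function names are hypothetical.

def travelling_ratio(promoter_density, gene_body_density):
    """TR = RNA Pol II density near the promoter / density in the gene body."""
    return promoter_density / gene_body_density


def tr_fold_change(tr_control, tr_treated):
    """Fold change in TR after treatment; >1 means relatively less elongation."""
    return tr_treated / tr_control


# (TR after DMSO, TR after leflunomide), as given in the Fig. 3 legend.
reported_tr = {
    "NPM1": (5.04, 8.10),
    "CCND1": (3.47, 4.67),
    "DGAT": (5.19, 5.34),
}

CUTOFF = 1.3  # fold-change threshold mentioned in the text for affected loci

affected = [
    gene
    for gene, (control, treated) in reported_tr.items()
    if tr_fold_change(control, treated) > CUTOFF
]

for gene, (control, treated) in reported_tr.items():
    print(f"{gene}: TR fold change = {tr_fold_change(control, treated):.2f}")
print("Above cutoff:", affected)
```

With these values, NPM1 and CCND1 exceed the cutoff whereas DGAT does not, in line with the legend's description of DGAT as minimally affected.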

Given the effect of DHODH inhibition on neural crest development, 
we tested its effects on melanoma growth. A771726 caused a dose- 
dependent decrease in the proliferation of human melanoma cell lines 
(A375, Hs 294T and RPMI-7951; Fig. 4a). Similarly, a short hairpin 
RNA directed against DHODH led to a 57.7% decrease in the pro- 
liferation of A375 cells, as well as a decrease in elongation as measured 
by ChIP-PCR (Supplementary Fig. 14). Microarray analysis of the 
A375 cell line treated with leflunomide revealed downregulation of 
genes that are required for neural crest development (such as 
SNAI2) and members of the NOTCH pathway (for example, HES6 
and JAG1), which is consistent with the effects of leflunomide on
zebrafish embryos (Supplementary Table 6). 

Pyrimidine NTP production is regulated at the level of carbamoyl-phosphate
synthetase (CAD), the enzyme that is directly upstream of
DHODH. CAD is phosphorylated by the mitogen-activated protein
kinase ERK2, a protein that would be activated in melanoma as a
result of the BRAF(V600E) mutation. We reasoned that combined 
blockade of BRAF(V600E) and DHODH would suppress melanoma 
growth. We measured melanoma cell proliferation after treatment with 
the BRAF(V600E) inhibitor PLX4720 (ref. 21) together with A771726 
(Fig. 4b, c and Supplementary Fig. 15a, b), and we found that the 
combination led to a cooperative suppression of melanoma growth. 
Figure 4 | Melanoma growth is suppressed by DHODH blockade in concert 
with BRAF(V600E) inhibition. a, A771726 causes a dose-dependent decrease 
in melanoma cell proliferation, as measured by the CellTiter-Glo assay, in three 
human melanoma cell lines that contain the BRAF(V600E) mutation (A375, 
Hs 294T and RPMI-7951). b, c, A771726 cooperates with the BRAF(V600E) 
inhibitor PLX4720 in inhibiting melanoma cell proliferation in the Hs 294T 
(b) and A375 (c) cell lines, as well as other melanoma cell lines (Supplementary 
Fig. 15). d, After subcutaneous transplantation of 3 X 10° A375 cells into nude 
mice (n = 4-5), both leflunomide (LEF) alone and PLX4720 alone impair 
tumour progression; the combination of these chemicals elicits an almost 
complete abrogation of tumour growth and results in complete tumour 
regression in two of five animals (DMSO versus PLX4720, P = 0.036; DMSO 
versus LEF, P = 0.006; PLX4720 or LEF versus PLX4720 + LEF, P = 0.006; 
PLX4720 versus LEF, P is not significant; analysis of variance, followed by 
Tukey’s post hoc test). a–d, Values shown are mean ± s.e.m. of three to five
replicates. 


PLX4720 had no effect in non-melanoma cell lines (Supplementary
Fig. 15c, BRAF WT). A771726 demonstrated some antiproliferative
activity in non-melanoma cell lines but was less potent in these cells 
than in melanoma cells (Supplementary Fig. 15d). 

We examined the in vivo effects of leflunomide and PLX4720 by 
using xenografts of A375 cells transplanted into nude mice (Fig. 4d and
Supplementary Fig. 16). At 12 days post treatment, tumours in mice
that had been treated with dimethylsulphoxide (DMSO) as a control
had grown 7.4 ± 1.3-fold. By contrast, tumours in PLX4720-treated
mice had grown 5.7 ± 0.16-fold, and those in leflunomide-treated mice
had grown 4.7 ± 0.12-fold. The combination of PLX4720 and leflunomide
led to an enhanced abrogation of tumour growth, with only
2.2 ± 0.9-fold growth. In 40% of animals, this combination led to almost
complete tumour regression (PLX4720 and leflunomide versus 
PLX4720 alone or leflunomide alone, P< 0.001, analysis of variance 
followed by Tukey’s post hoc test). Therefore, we have found that an 
inhibitor of embryonic neural crest development, leflunomide, blocks 
in vivo tumour growth in combination with the BRAF(V600E) inhi- 
bitor PLX4720 when used at clinically meaningful doses. 
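
To put the xenograft numbers above on a common scale, the sketch below expresses each group's mean fold growth at day 12 as a fraction of the DMSO control. The fold-growth values come from the text (error terms omitted); normalizing to the control in this way is an illustrative choice and an assumption, not an analysis reported by the authors.

```python
# Mean tumour fold growth at day 12, as reported in the text.
fold_growth = {
    "DMSO": 7.4,
    "PLX4720": 5.7,
    "leflunomide": 4.7,
    "PLX4720 + leflunomide": 2.2,
}


def relative_to_control(growth, control="DMSO"):
    """Express each group's fold growth as a fraction of the control group's."""
    baseline = growth[control]
    return {group: value / baseline for group, value in growth.items()}


relative = relative_to_control(fold_growth)

for group, fraction in relative.items():
    print(f"{group}: {fraction:.0%} of control growth")
```

On this scale the combination group reaches roughly 30% of control growth, compared with about 77% for PLX4720 alone and 64% for leflunomide alone.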

Our data suggest that inhibition of DHODH abrogates the tran- 
scriptional elongation of genes that are required for both neural crest 
development and melanoma growth, including Myc target genes and 
mitfa. Although DHODH inhibition would be expected to lead to 
ubiquitous defects, human mutations in DHODH cause Miller’s syndrome,
a craniofacial disorder that is similar to syndromes with
defects in neural crest development. Our data support recent findings
that elongation factors are important for both neural crest development
and cancer growth. Developmental regulators of transcriptional
elongation have recently been identified to be important in
haematopoiesis, and identification of such factors in the neural crest
awaits further study. 

Using chemical genetic approaches in zebrafish and Xenopus allows 
the identification of molecules that require in vivo contexts for the 
expression of relevant phenotypes. Inhibition of DHODH may be a
unique in vivo mechanism for modulating transcriptional elongation.
Leflunomide is a well-tolerated anti-arthritis drug in humans, and
our data suggest that, for the treatment of melanoma, it would be most
effective in combination with a BRAF(V600E) inhibitor. This combination
therapy may help to overcome resistance to BRAF(V600E)
inhibitors. As an increasing number of genomic changes are identified
in cancerous cells, the challenge is to target these in concert with
lineage-specific factors to achieve therapeutic synergy. Our approach 
to identifying lineage-specific suppressors in zebrafish embryos can be 
generalized to other cell types, with direct relevance to human cancer. 


METHODS SUMMARY 


Microarray analysis was performed on four groups of embryos at 72 h post fertilization:
wild type, Tg(mitfa:BRAF(V600E)), p53−/−, and Tg(mitfa:BRAF(V600E));
p53−/− double mutants. Array analysis was also performed for adult
Tg(mitfa:BRAF(V600E)); p53−/− melanomas and for adjacent skin. The transcriptional
signature of the melanomas was used in gene set enrichment analysis to
identify genes that were significantly enriched in the Tg(mitfa:BRAF(V600E));
p53−/− embryos. The 123 genes that make up this signature, which is
enriched for markers of the neural crest, were concordantly upregulated or downregulated
in both BRAF(V600E); p53−/− embryos and tumours. In situ hybridization
was used to examine the expression of crestin (a pan neural crest marker)
and that of other neural crest genes in embryos (at 24–72 h post fertilization) and
adult tumours. Chemical screening was performed to identify suppressors of the
crestin⁺ lineage by treating wild-type embryos from 50% epiboly to 24 h post
fertilization with various chemicals, followed by robot-controlled in situ hybrid- 
ization. Two inhibitors of DHODH abrogated crestin expression: NSC210627 and 
leflunomide. The latter was used for further studies owing to its more widespread 
availability. The effect of leflunomide on embryonic neural crest development in 
zebrafish was assessed by scoring embryos for melanocytes, iridophores and glial 
cells. Leflunomide was further assessed for its ability to affect the multipotent self-renewal
of purified p75⁺α4-integrin⁺ rat neural crest stem cells (referred to in the
main text as neural crest stem cells). The effects of leflunomide on transcriptional
elongation in the neural crest were tested using the spt5(m806) allele and measuring
pigmentation in response to 3–5 μM leflunomide. Elongation in melanoma cells
was assayed by ChIP-seq using an antibody specific for RNA Pol II and measuring 
the TR. Leflunomide was tested for anti-melanoma effects on human melanoma 
cell lines in the presence or absence of the BRAF(V600E) inhibitor PLX4720. In 
vitro proliferation assays were performed using the CellTiter-Glo system 
(Promega). In vivo effects were tested by treating established A375 xenografts with 
daily intraperitoneal doses of PLX4720 alone, leflunomide alone or both, and the 
tumour growth rate was measured on days 4, 7 and 12. 
FAS and NF-κB signalling modulate dependence of
lung cancers on mutant EGFR


Trever G. Bivona, Haley Hieronymus, Joel Parker, Kenneth Chang, Miquel Taron, Rafael Rosell, Philicia Moonsamy, 
Kimberly Dahlman, Vincent A. Miller, Carlota Costa, Gregory Hannon & Charles L. Sawyers 


Human lung adenocarcinomas with activating mutations in EGFR 
(epidermal growth factor receptor) often respond to treatment with 
EGFR tyrosine kinase inhibitors (TKIs), but the magnitude of tumour 
regression is variable and transient. This heterogeneity in treatment 
response could result from genetic modifiers that regulate the degree 
to which tumour cells are dependent on mutant EGFR. Through a 
pooled RNA interference screen, we show that knockdown of FAS and 
several components of the NF-κB pathway specifically enhanced cell 
death induced by the EGFR TKI erlotinib in EGFR-mutant lung 
cancer cells. Activation of NF-κB through overexpression of c-FLIP 
or IKK (also known as CFLAR and IKBKB, respectively), or silencing 
of IκB (also known as NFKBIA), rescued EGFR-mutant lung cancer 
cells from EGFR TKI treatment. Genetic or pharmacologic inhibition 
of NF-κB enhanced erlotinib-induced apoptosis in erlotinib-sensitive 
and erlotinib-resistant EGFR-mutant lung cancer models. Increased 
expression of the NF-κB inhibitor IκB predicted for improved 
response and survival in EGFR-mutant lung cancer patients treated with 
EGFR TKI. These data identify NF-κB as a potential companion drug 
target, together with EGFR, in EGFR-mutant lung cancers and 
provide insight into the mechanisms by which tumour cells escape from 
oncogene dependence. 

Despite marked clinical successes achieved with inhibitors of ‘driver’ 
kinases that promote tumour growth, responses are rarely complete 
and also vary in duration. We hypothesized that the heterogeneity of 
treatment response may result from genetic modifiers that regulate the 
degree to which tumour cells are dependent upon the driver kinase and 
the response to TKI treatment. 

Using lung cancer as a model to identify such modifiers, we con- 
ducted a screen for genes that, when silenced, enhance EGFR depend- 
ence in EGFR-mutant lung cancer cell lines. We intentionally selected 
H1650 cells for the primary screen because they are insensitive to 
EGFR TKIs despite expressing a mutant EGFR with an exon 19 dele- 
tion (ex19del) that predicts for erlotinib sensitivity in patients. H1650 
cells harbour no known EGFR TKI resistance mechanisms other than 
functional PTEN loss, which does not fully account for their 
insensitivity. To identify small hairpin RNAs (shRNAs) that might restore 
dependence on mutant EGFR in H1650 cells, we introduced a pooled 
shRNA library targeting >2,000 ‘cancer-relevant’ genes (Supplementary 
Table 1) and treated these cells with vehicle or erlotinib, a standard 
EGFR TKI used in EGFR-mutant lung cancer patients. Hairpins 
targeting 36 genes reproducibly conferred erlotinib sensitivity in H1650 
cells across three replicates (threshold 1.5-fold depletion in 
erlotinib/vehicle, P < 0.1; Supplementary Table 2). 

Among the primary screen hits, 18 of the targeted genes, including 
FAS, could be directly or indirectly linked to NF-κB signalling, which 
is known to have a role in survival signalling (Supplementary Table 2). 
Because recent data demonstrated that FAS and NF-κB signalling can 
promote tumour growth, we proposed that FAS-NF-κB may rescue 
EGFR-mutant tumour cells from EGFR inhibition. First, we confirmed 
that shRNAs targeting six of the highest scoring NF-κB 
pathway-associated genes identified in the primary pooled screen effectively 
silenced expression of their targets in H1650 cells (Supplementary 
Fig. 1a). Then we validated their growth inhibitory effects in 
erlotinib-treated H1650 cells using independent siRNAs (Fig. 1a, 
Supplementary Fig. 1b). Importantly, this reduction in cell viability was 
associated with increased caspase 3/7 activity (Fig. 1b), indicating that 
knockdown of these genes promoted erlotinib-induced apoptosis. To address 
directly the role of the NF-κB pathway in EGFR TKI sensitivity, we 
knocked down the major NF-κB subunit RELA (not represented in the 
pooled library) and found that RELA knockdown also induced 
erlotinib sensitivity in H1650 cells using short term viability and apoptosis 
readouts as well as clonogenic assays (Fig. 1a, b and Supplementary 
Fig. 2). Furthermore, this sensitizing effect was specific to EGFR 
inhibition because stable RELA knockdown did not alter sensitivity 
to cisplatin, paclitaxel or ultraviolet treatment (Supplementary Fig. 2) 
or other TKI (imatinib) (Supplementary Fig. 3). Because c-FLIP and 
RIPK have been implicated as intermediate signalling proteins linking 
FAS to NF-κB, we asked if these genes also regulate erlotinib 
sensitivity. Indeed, silencing of c-FLIP or RIPK induced erlotinib sensitivity 
in H1650 cells (Fig. 1a, b). The effects of the entire panel of siRNAs 
targeting nine different NF-κB pathway-associated genes were 
observed regardless of PTEN status (Supplementary Fig. 4a, b). 

Figure 1 | Mutant EGFR oncogene dependence requires downregulation of 
the FAS-NF-κB pathway. a, Viability (CellTiter-Glo assay) of H1650 cells 
treated with vehicle or 100 nM erlotinib upon introduction of either a 
non-target siRNA pool or gene-specific siRNA pools targeting the genes. Relative 
cell viability is fold change in viability in erlotinib relative to vehicle non-target 
siRNA control (viability decrease >25% = validated). n = 3; mean ± s.e.m. 
b, Caspase 3/7 activation (Caspase-Glo assay) in indicated cell lines treated as in 
a. c, d, Viability (CellTiter-Glo assay) of indicated isogenic HBEC cells treated 
as in a. n = 3; mean ± s.e.m. 

Next we asked whether the NF-κB pathway genes that scored in 
H1650 cells show similar effects in other EGFR mutant lung cancer cell 
lines. Whereas H1650 cells express EGFR(ex19del), 11-18 human lung 
cancer cells express EGFR(L858R) yet are also relatively insensitive to 
erlotinib without a resistance mechanism validated in patients. Each 
of the siRNAs that scored in H1650 cells also scored in cell viability and 
survival assays in 11-18 cells (Supplementary Fig. 4c, d). Silencing of 
the same set of genes also enhanced the growth suppressive effects of 
erlotinib in HCC827 cells (expressing EGFR(ex19del)) and in H3255 cells 
(expressing EGFR(L858R)), both of which are relatively more sensitive 
to EGFR TKIs (Supplementary Fig. 4e-h). 

Because unknown alterations in human lung cancer cells could also 
influence erlotinib sensitivity, we used an isogenic system of 
EGFR-transduced human bronchial epithelial cells (HBEC) to test whether 
silencing of these genes cooperates with mutant EGFR to induce 
oncogene dependence. Consistent with prior data, HBEC-EGFR(L858R) and 
HBEC-EGFR(ex19del) cells were not sensitive to erlotinib (100 nM). 
However, silencing of each of the nine genes studied in the lung cancer 
cell lines also induced erlotinib sensitivity in HBEC-EGFR(ex19del) 
(Fig. 1c) and HBEC-EGFR(L858R) cells (Supplementary Fig. 5). Induction 
of erlotinib sensitivity seemed equivalent across both the exon19del and 
L858R EGFR genotypes. The erlotinib-sensitizing effect of silencing these 
genes was specific to mutant EGFR because no potentiating effect was seen 
in wild-type HBEC-EGFR(WT) cells (Fig. 1d). 

Because FAS knockdown promoted erlotinib-induced apoptosis, we 
measured the activation state of three signalling pathways linked to cell 
survival (AKT, ERK, also known as MAPK1, and NF-κB) to determine 
which, if any, was associated with erlotinib-induced cell death. In 
erlotinib-sensitive HCC827 cells, erlotinib treatment alone led to reduced 
levels of pAKT, pERK and pRELA (a measure of NF-κB activity), 
regardless of FAS expression (Supplementary Fig. 6). But in 
erlotinib-resistant H1650 cells, these effects of erlotinib were only 
observed when FAS was silenced (Supplementary Fig. 6). In 
HBEC-EGFR(L858R) cells, erlotinib suppressed AKT and ERK activation 
regardless of FAS expression, but RELA phosphorylation was abolished 
only upon FAS silencing (Supplementary Fig. 6). Together these data 
implicate persistent NF-κB signalling in resistance to erlotinib-induced 
cell death. 

NF-κB signalling was recently shown to be essential for 
KRAS-driven tumour growth. Oncogenic KRAS and EGFR may drive tumour 
growth through a common signalling pathway. Because in our studies 
NF-κB knockdown alone was not lethal in EGFR-mutant lung cancer 
cell lines (HCC827, H3255), we compared the effect of silencing NF-κB 
in HBEC-KRAS12V cells versus HBEC-EGFR(L858R) cells directly. 
Knockdown of RELA alone impaired the growth of HBEC-KRAS12V 
but not HBEC-EGFR(L858R) cells (Supplementary Fig. 7), indicating 
that mutant EGFR does not function identically to mutant KRAS in this 
model. 

To extend our findings to in vivo models, we used shRNAs targeting 
FAS, RELA or c-FLIP (distinct from the primary screen) to achieve 
stable knockdown (Supplementary Fig. 8) of the cognate protein in 
erlotinib-resistant H1650 cells. Silencing of each gene by stable shRNA 
enhanced erlotinib sensitivity, with a 1-2 log-fold decrease in IC50 
(half maximal inhibitory concentration; Fig. 2a) and induction of 
apoptosis as measured by PARP cleavage (Fig. 2b). Similar results were 
observed in 11-18 and HBEC-EGFR(L858R) cells (Supplementary Fig. 
9). We found that erlotinib treatment induced tumour regression and 
apoptosis only upon knockdown of FAS or RELA (Fig. 2c, d and 
Supplementary Fig. 10a, b) in H1650 human xenograft tumours. 
Figure 2 | Suppression of the FAS-NF-κB pathway enhances EGFR TKI 
response in EGFR-mutant cells and tumour models. a, Erlotinib dose 
response in H1650 cells expressing a non-target control shRNA or a FAS 
(shRNA2), RELA (shRNA1), or c-FLIP (shRNA1) shRNA (5 µM erlotinib IC50 
in parental H1650 cells). Cell viability was assayed as in Fig. 1. n = 3; 
mean ± s.e.m. b, Immunoblots showing expression of indicated signalling 
proteins in H1650 cells generated in a. Data represent three independent 
experiments. NT, non-target; PARPy, PARP cleaved. (The decrease in c-FLIP 
protein by shRNA was less complete than the decrease in c-FLIP mRNA level by 
siRNA; Supplementary Fig. 1). c, Effects of stable knockdown of FAS (shRNA3) 
or RELA (shRNA2) on erlotinib sensitivity in H1650 xenograft tumours, 
compared to non-target shRNA control H1650 tumours. Established tumours 
(>200 mm³, n = 10 per treatment group) were randomized and treated for 
7 days with 12.5 mg erlotinib per kg per day, n = 10. Data are expressed as 
mean ± s.e.m. d, Immunoblots showing expression of indicated proteins in 
representative H1650 tumour xenografts from c analysed at treatment day 7. 


We next asked whether activation of NF-κB might be sufficient to 
confer erlotinib resistance in EGFR-mutant tumours. Canonical 
NF-κB signalling requires downregulation of the NF-κB inhibitor IκB. 
Therefore we predicted that decreased IκB levels, leading to increased 
NF-κB signalling, might promote EGFR TKI resistance. First, we 
noted that IκB expression was lower in three erlotinib-resistant 
EGFR-mutant lung cancer cell lines 
compared with three sensitive cell lines (Fig. 3a and Supplementary 
Fig. 11). Furthermore, we found that knockdown of IκB increased 
NF-κB phosphorylation (Fig. 3b) and conferred partial resistance to 
erlotinib in HCC827 cells in vitro (Fig. 3c) and in vivo (Fig. 3d-f). These 
effects were specific to EGFR inhibition because stable knockdown of 
IκB did not protect HCC827 cells from cisplatin, paclitaxel or 
ultraviolet treatment, as measured by clonogenic assays (Supplementary 
Fig. 12). c-FLIP has been implicated as an intermediate signalling 
molecule that activates NF-κB downstream of FAS. Ectopic 
expression of c-FLIP induced persistent phosphorylation of RELA and 
resistance to erlotinib in HCC827 cells (Supplementary Fig. 13). Together 
these data suggest that NF-κB activation can rescue EGFR-mutant 
tumour cells from EGFR TKI treatment. 

Acquired resistance to erlotinib in patients can occur through second 
site, drug-resistant mutations in EGFR or through amplification of the 
MET kinase. These same resistance mechanisms can also evolve in 
vitro with prolonged growth in EGFR TKI. We therefore asked if 
NF-κB activation occurs naturally during the derivation of 
erlotinib-resistant HCC827 cells through prolonged and continuous EGFR TKI 
treatment. In parental HCC827 cells acute erlotinib treatment (100 nM, 
24 h) inhibited EGFR, MET, AKT, ERK and RELA phosphorylation. 
Continuous erlotinib treatment resulted in the outgrowth of eight 
resistant HCC827 subclones (over >8 weeks) in which EGFR 
phosphorylation remained inhibited by erlotinib. Consistent with prior 
data, we observed compensatory upregulation of phosphorylated 
MET in four erlotinib-resistant subclones. We also observed increased 
FAS expression in the absence of MET upregulation in four additional 
subclones. Increased RELA phosphorylation and decreased IκB 
expression were observed in three out of four subclones in which 
FAS was upregulated (Supplementary Fig. 14a). Knockdown of FAS 
in those subclones with FAS upregulation decreased pRELA and 
enhanced erlotinib sensitivity, but did not result in decreased pAKT 
or pERK (Supplementary Fig. 14b, c). Thus, FAS-mediated EGFR TKI 
resistance is distinct from MET-mediated EGFR TKI resistance, which 
occurs primarily through reactivation of pERK and pAKT in HCC827 
cells. Our results are consistent with recent data establishing both 
MEK-dependent and MEK-independent mechanisms of resistance 
of BRAF-mutant melanomas to the BRAF inhibitor PLX4032. 

Figure 3 | NF-κB activation through IκB downregulation confers EGFR 
TKI resistance in EGFR-mutant lung cancer models. a, Correlation of IκB 
expression with EGFR TKI sensitivity in EGFR-mutant lung cancer cells (IκB 
mRNA expression from Oncomine; sensitive <0.02 µM IC50: HCC827, H3255, 
HCC4006; resistant >1 µM IC50: H1650, H1975, H820). b, Immunoblots 
showing IκB and pRELA expression in lysates from HCC827 treated with 
non-targeting or IκB siRNA pools. c, Viability (CellTiter-Glo assay) of HCC827 cells 
treated with non-target siRNA pool or IκB siRNA pool and either vehicle or 
erlotinib (100 nM). RLU is relative luciferase units (n = 3; mean ± s.e.m.). 
d, e, Effects of stable knockdown of IκB on erlotinib sensitivity in HCC827 
tumour xenografts compared to non-target shRNA control HCC827 tumours. 
Tumours were established and treated as in Fig. 2c (n = 8 per treatment group, 
mean ± s.e.m., 7-day treatment). f, Immunoblots showing expression of 
indicated proteins in representative HCC827 tumour xenografts analysed at 
treatment day 7. 

NF-κB pathway inhibition can be achieved through inhibition of IκB 
kinase (IKK, also known as IKBKB), the primary kinase that promotes 
IκB instability leading to NF-κB activation. We therefore asked if 
pharmacologic or genetic inhibition of IKK enhances erlotinib 
sensitivity in our erlotinib-resistant lung cancer models. Treatment of 
H1650 (Fig. 4a) or HBEC-EGFR(L858R) cells (Supplementary Fig. 
15a, b) with the IKKβ inhibitor BMS-345541 (ref. 21) inhibited 
RELA phosphorylation and restored erlotinib sensitivity. Because 
BMS-345541 has suboptimal pharmacokinetic properties we used an 
shRNA to knock down IKK in H1650 cells. The resulting H1650 cells, 
or those transduced with a non-target control shRNA, were then grown 
as xenograft tumours and treated with either vehicle or erlotinib. IKKβ 
knockdown resulted in increased levels of IκB and decreased NF-κB 
phosphorylation, as expected. Erlotinib treatment induced tumour 
regression and apoptosis only upon knockdown of IKKβ in this system 
(Fig. 4b and Supplementary Fig. 15c, d). 

Figure 4 | Rationale for combined NF-κB and EGFR inhibition in 
EGFR-mutant lung cancers. a, Dose response in H1650 cells treated with 
BMS-345541 (IKK inhibitor) and additionally either vehicle or erlotinib (100 nM). 
Viability was measured as in Fig. 1 (n = 3, mean ± s.e.m.). b, Effects of stable 
knockdown of IKK on erlotinib sensitivity in H1650 tumour xenografts 
compared to non-target shRNA control H1650 tumours. Established tumours 
(>200 mm³, n = 10 per treatment group) were randomized and treated for 
7 days with 12.5 mg erlotinib per kg per day or vehicle. Data are expressed as in 
Fig. 2c (± s.e.m.). c, d, Effects of IκB expression on progression-free survival in 
patients with EGFR-mutant lung cancers (c) treated with single agent EGFR 
TKI (n = 52) or (d) chemotherapy and surgery (n = 43). Clinical 
characteristics and responses were defined previously. Median 
progression-free survival and overall survival for the entire EGFR TKI-treated cohort were 
20 months (95% confidence interval, 13-26.9) and 33 months (95% confidence 
interval, 22.2-43.8), respectively. 

To determine the clinical relevance of our findings, we examined the 
status of NF-κB activation in a cohort of 52 patients (part of a 
previously characterized cohort) with EGFR mutant lung cancer who were 
treated with erlotinib and did not have evidence of the T790M 
mutation as a potential cause of erlotinib resistance. Because reduced IκB 
levels were associated with erlotinib resistance in cell lines and tumour 
models, we asked if IκB expression correlated with EGFR TKI response 
in patients with EGFR-mutant lung cancers. Low IκB expression 
(‘high-NF-κB’ activation state) was predictive of worse 
progression-free survival (Fig. 4c and Supplementary Table 4a) and decreased 
overall survival (Supplementary Fig. 16a and Supplementary Table 
4b). This finding was specific to erlotinib because IκB expression did 
not predict outcome in a previously characterized cohort of 43 
EGFR-mutant lung cancer patients who were not treated with EGFR TKIs but 
instead with chemotherapy and surgery (Fig. 4d and Supplementary 
Fig. 16b). We also observed a significant positive correlation between 
FAS expression and RELA, c-FLIP and IκB levels in the 
erlotinib-treated patients (Supplementary Table 5). Together, our findings 
suggest that the extent to which EGFR-mutant tumour cells engage the 
NF-κB pathway may, in part, explain the non-uniform response to 
EGFR TKI treatment observed in EGFR-mutant lung cancer patients 
and provide a rationale for testing an IKK inhibitor in combination with 
an EGFR TKI in EGFR-mutant lung cancer patients. 
METHODS SUMMARY 

Compounds and cell culture. H1650, 11-18, H3255 and HCC827 cells were provided 
by W. Pao and were grown in RPMI supplemented with 10% FBS. H1650 
PTEN-reconstituted cells have been described previously. HBEC cell lines were provided 
by J. Minna and grown in K-SFM. Erlotinib was from LC Labs. BMS-345541 was 
from Sigma-Aldrich. 

Antibodies and immunoblots. All antibodies were from Cell Signaling 
Technologies and used at 1:1,000 dilution according to manufacturer’s instruc- 
tions. Lysates from cell lines and xenograft tumours were generated using standard 
methods and RIPA buffer and assayed by immunoblots. 

Pooled shRNA screen. H1650 cells were infected with 6,783 shRNAs targeting 
~2,500 cancer-associated genes in a pooled format. Abundance of each hairpin 
within the total population over time was detected by hybridization to custom 
Agilent oligonucleotide microarrays with half-hairpin and barcode probes 
corresponding to each shRNA. The abundance difference for the erlotinib-synthetic 
screen was determined as the average of the log2 signal from erlotinib treated cells 
at later time points (day 10 and day 21) minus the average of this log2 signal from 
vehicle treated cells at the later time points (day 10 and day 21). Significant hits 
were defined as those altered by 1.5 fold, P < 0.1. 
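
The hit-calling rule described above can be sketched in a few lines. This is a minimal illustration only, not the authors' analysis code: the hairpin names and signal values are hypothetical, and the P value is taken as a precomputed input because the statistical test used is not specified here.

```python
import math
from statistics import mean

FOLD_THRESHOLD = 1.5  # from the text: hits altered by 1.5 fold
P_THRESHOLD = 0.1     # from the text: P < 0.1


def abundance_difference(erlotinib_log2, vehicle_log2):
    """Average log2 signal (erlotinib, late time points) minus the
    average log2 signal (vehicle, late time points)."""
    return mean(erlotinib_log2) - mean(vehicle_log2)


def is_depleted_hit(diff_log2, p_value):
    """A hairpin is called depleted in erlotinib when the log2 difference
    corresponds to at least a 1.5-fold decrease and P < 0.1."""
    return diff_log2 <= -math.log2(FOLD_THRESHOLD) and p_value < P_THRESHOLD


# Hypothetical log2 signals at day 10 and day 21; P values assumed precomputed.
hairpins = {
    "shExampleA": {"erlotinib": [8.1, 7.9], "vehicle": [9.0, 9.2], "p": 0.04},
    "shExampleB": {"erlotinib": [9.1, 9.0], "vehicle": [9.0, 9.2], "p": 0.60},
}

hits = [
    name
    for name, d in hairpins.items()
    if is_depleted_hit(abundance_difference(d["erlotinib"], d["vehicle"]), d["p"])
]
```

Working in log2 space turns the 1.5-fold threshold into a fixed offset of log2(1.5), roughly 0.585, so depletion is a simple comparison against a negative cutoff.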

RNA interference assays. Sequences for individual small interfering RNA (siRNA) 
or shRNA (from Sigma or Open Biosystems) are shown in Supplementary Table 3. 
For siRNA experiments, cells were treated in triplicate according to manufacturer’s 
protocol. For shRNA experiments, cells were spin-infected with shRNA virus. 
Where indicated, cells were treated with vehicle or erlotinib 24 h after RNAi 
treatment. Gene silencing was confirmed by mRNA isolation and gene-specific 
quantitative RT-PCR and by immunoblots. Cell viability or survival was measured using 
CellTiter-Glo and Caspase-Glo, respectively, 24-96 h after drug treatment accord- 
ing to the manufacturer’s instructions and normalized to cells with a non-target 
siRNA pool or shRNA hairpin. 
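
As an illustration of the normalization described above, the following sketch computes relative viability against a vehicle-treated non-target control and applies the validation cutoff given in the Fig. 1 legend (viability decrease >25%). The luminescence readings are hypothetical triplicates, and the function names are ours rather than part of any assay software.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean for a list of replicate readings."""
    return stdev(values) / math.sqrt(len(values))


def relative_viability(treated_rlu, control_rlu):
    """Mean signal of treated wells divided by the mean signal of the
    vehicle-treated non-target control wells."""
    return mean(treated_rlu) / mean(control_rlu)


def is_validated(rel_viability, min_decrease=0.25):
    """Validated when viability falls by more than 25% relative to control."""
    return (1.0 - rel_viability) > min_decrease


# Hypothetical triplicate luminescence readings (arbitrary units).
control_vehicle = [1000.0, 1040.0, 960.0]   # non-target siRNA + vehicle
gene_erlotinib = [600.0, 640.0, 560.0]      # gene-specific siRNA + erlotinib
nontarget_erlotinib = [900.0, 950.0, 850.0]  # non-target siRNA + erlotinib

rel_gene = relative_viability(gene_erlotinib, control_vehicle)
rel_nontarget = relative_viability(nontarget_erlotinib, control_vehicle)
```

With these made-up numbers the gene-specific pool reduces viability by 40% and would count as validated, whereas the non-target pool in erlotinib (10% decrease) would not.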


Received 2 June 2010; accepted 24 January 2011. 


1. Haber, D. A. et al. Molecular targeted therapy of lung cancer: EGFR mutations and 
response to EGFR inhibitors. Cold Spring Harb. Symp. Quant. Biol. 70, 419-426 
(2005). 

2. Miller, V. A. et al. Molecular characteristics of bronchioloalveolar carcinoma and 
adenocarcinoma, bronchioloalveolar carcinoma subtype, predict response to 
erlotinib. J. Clin. Oncol. 26, 1472-1478 (2008). 

3. Weinstein, I. B. & Joe, A. Oncogene addiction. Cancer Res. 68, 3077-3080 (2008). 

4. Knight, Z. A., Lin, H. & Shokat, K. M. Targeting the cancer kinome through 
polypharmacology. Nature Rev. Cancer 10, 130-137 (2010). 

5. Sos, M. L. et al. PTEN loss contributes to erlotinib resistance in EGFR-mutant lung 
cancer by activation of Akt and EGFR. Cancer Res. 69, 3256-3261 (2009). 

6. Silva, J. M. et al. Profiling essential genes in human mammary cells by multiplex 
RNAi screening. Science 319, 617-620 (2008). 

7. Peter, M. E. et al. The CD95 receptor: apoptosis revisited. Cell 129, 447-450 (2007). 

8. Chen, L. et al. CD95 promotes tumour growth. Nature 465, 492-496 (2010). 

9. O'Reilly, L. A. et al. Membrane-bound Fas ligand only is essential for Fas-induced 
apoptosis. Nature 461, 659-663 (2009). 


526 | NATURE | VOL 471 | 24 MARCH 2011 


10. Green, D. R. Cancer: a wolf in wolf's clothing. Nature 465, 433 (2010). 

11. Gong, Y. et al. High expression levels of total IGF-1R and sensitivity of NSCLC cells in 
vitro to an anti-IGF-1R antibody (R1507). PLoS ONE 4, e7273 (2009). 

12. Sato, M. et al. Multiple oncogenic changes (K-RAS(V12), p53 knockdown, mutant 
EGFRs, p16 bypass, telomerase) are not sufficient to confer a full malignant 
phenotype on human bronchial epithelial cells. Cancer Res. 66, 2116-2128 
(2006). 

13. Meylan, E. et al. Requirement for NF-κB signalling in a mouse model of lung 
adenocarcinoma. Nature 462, 104-107 (2009). 

14. Luo, J. L., Kamata, H. & Karin, M. IKK/NF-κB signaling: balancing life and death - a 
new approach to cancer therapy. J. Clin. Invest. 115, 2625-2632 (2005). 

15. Pao, W. et al. Acquired resistance of lung adenocarcinomas to gefitinib or erlotinib 
is associated with a second mutation in the EGFR kinase domain. PLoS Med. 2, e73 
(2005). 

16. Bean, J. et al. MET amplification occurs with or without T790M mutations in EGFR 
mutant lung tumors with acquired resistance to gefitinib or erlotinib. Proc. Natl 
Acad. Sci. USA 104, 20932-20937 (2007). 

17. Engelman, J. A. et al. Allelic dilution obscures detection of a biologically significant 
resistance mutation in EGFR-amplified lung cancer. J. Clin. Invest. 116, 
2695-2706 (2006). 

18. Turke, A. B. et al. Preexistence and clonal selection of MET amplification in EGFR 
mutant NSCLC. Cancer Cell 17, 77-88 (2010). 

19. Johannessen, C. M. et al. COT drives resistance to RAF inhibition through MAP 
kinase pathway reactivation. Nature 468, 968-972 (2010). 

20. Nazarian, R. et al. Melanomas acquire resistance to B-RAF(V600E) inhibition by 
RTK or N-RAS upregulation. Nature 468, 973-977 (2010). 

21. Yang, J., Amiri, K. I., Burke, J. R., Schmid, J. A. & Richmond, A. BMS-345541 targets 
inhibitor of κB kinase and induces apoptosis in melanoma: involvement of nuclear 
factor κB and mitochondria pathways. Clin. Cancer Res. 12, 950-960 (2006). 

22. Rosell, R. et al. Screening for epidermal growth factor receptor mutations in lung 
cancer. N. Engl. J. Med. 361, 958-967 (2009). 

23. Chitale, D. et al. An integrated genomic analysis of lung cancer reveals loss of 
DUSP4 in EGFR-mutant tumors. Oncogene 28, 2773-2783 (2009). 


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank W. Pao, H. Varmus and members of the MSKCC Lung 
Cancer Oncogenome group for intellectual support. We thank J. Javier Sanchez for 
analysing human lung cancer gene expression data. We thank J. Wongvipat and 
E. Philips for technical assistance and W. Polkinghorn and members of the Sawyers 
Laboratory for critique of the manuscript. T.G.B. is supported by the MSKCC Clinical 
Scholars Fellowship funded by the Charles A. Dana Foundation, an ASCO YIA, and the 
Caine Halter Lung Cancer Research Fund/Uniting Against Lung Cancer Research 
Grant. C.L.S. and G.H. are Investigators of the Howard Hughes Medical Institute. 


Author Contributions T.G.B. and H.H. designed research, performed experiments, 
analysed data and co-wrote the paper. K.C., P.M., K.D., M.T. and C.C. provided reagents, 
performed experiments and analysed data. J.P. and V.A.M. analysed data. R.R., G.H. and 
C.L.S. designed research, analysed experiments and co-wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to C.L.S. (sawyersc@mskcc.org). 


©2011 Macmillan Publishers Limited. All rights reserved 




doi:10.1038/nature09990 


A cis-regulatory map of the Drosophila genome 


Nicolas Nègre, Christopher D. Brown, Lijia Ma, Christopher Aaron Bristow, Steven W. Miller, Ulrich Wagner, 
Pouya Kheradpour, Matthew L. Eaton, Paul Loriaux, Rachel Sealfon, Zirong Li, Haruhiko Ishii, Rebecca F. Spokony, 
Jia Chen, Lindsay Hwang, Chao Cheng, Richard P. Auburn, Melissa B. Davis, Marc Domanus, Parantu K. Shah, 
Carolyn A. Morrison, Jennifer Zieba, Sarah Suchy, Lionel Senderowicz, Alec Victorsen, Nicholas A. Bild, A. Jason Grundstad, 
David Hanley, David M. MacAlpine, Mattias Mannervik, Koen Venken, Hugo Bellen, Robert White, Mark Gerstein, 
Steven Russell, Robert L. Grossman, Bing Ren, James W. Posakony, Manolis Kellis & Kevin P. White 


Systematic annotation of gene regulatory elements is a major 
challenge in genome science. Direct mapping of chromatin modification 
marks and transcription factor binding sites genome-wide has 
successfully identified specific subtypes of regulatory elements. In 
Drosophila several pioneering studies have provided genome-wide 
identification of Polycomb response elements, chromatin states, 
transcription factor binding sites, RNA polymerase II regulation 
and insulator elements; however, comprehensive annotation of the 
regulatory genome remains a significant challenge. Here we describe 
results from the modENCODE cis-regulatory annotation project. 
We produced a map of the Drosophila melanogaster regulatory 
genome on the basis of more than 300 chromatin 
immunoprecipitation data sets for eight chromatin features, five histone deacetylases 
and thirty-eight site-specific transcription factors at different stages 
of development. Using these data we inferred more than 20,000 
candidate regulatory elements and validated a subset of predictions 
for promoters, enhancers and insulators in vivo. We also identified 
nearly 2,000 genomic regions of dense transcription factor binding 
associated with chromatin activity and accessibility. We discovered 
hundreds of new transcription factor co-binding relationships 
and defined a transcription factor network with over 800 potential 
regulatory relationships. 

To reveal chromatin, promoter and enhancer domains in the 
genome, we performed a developmental time course of six histone 
modifications, the Drosophila CREB binding protein (CBP) and RNA 
polymerase II (PolII) across twelve stages of embryonic, larval, pupal 
and adult development (Supplementary Table 1 and 
Supplementary Figs 1-2; see Supplementary Methods). We used whole animals 
to generate the maximum number of chromatin marks across the 
genome. We identified 506,001 chromatin-associated features covering 
101 megabases (Mb) (86.99%) of the non-repetitive genome. To relate 
these chromatin features to gene activity, we quantified transcript levels 
by high-throughput complementary DNA sequencing (RNA-seq) with 
the same biological samples used for chromatin immunoprecipitation 
(ChIP). Additionally, we mapped 38 functionally diverse transcription 
factors in different developmental stages and cell types. A total of 
155,048 transcription factor binding sites (TFBSs) were identified, 
including 35,096 unique TFBSs. Of these, 93.76% overlap at least one 
chromatin feature. We noted that although the majority of factors are 
bound in discrete regions, several are distributed in larger domains 
(Supplementary Table 1 and Supplementary Fig. 3). We also 
characterized the binding distributions of five histone deacetylases 
(HDACs), identifying a total of 19,937 HDAC binding sites mapping 
to 7,692 unique genomic locations. Of these, 99.25% overlap with at 
least one chromatin feature, and 94.58% overlap with at least one TFBS. 
All data from this study are available at http://www.modencode.org 
and http://www.cistrack.org. 
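
Overlap percentages of the kind quoted above (for example, the share of TFBSs that overlap at least one chromatin feature) can be computed with a simple interval comparison. The sketch below is one possible approach and not the pipeline used in this study; it assumes half-open coordinates, and the chromosome names and positions are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into a sorted,
    non-overlapping list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def fraction_overlapping(sites, features):
    """Fraction of sites overlapping at least one feature.

    Both inputs are lists of (chrom, start, end) with half-open coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))
    merged = {c: merge_intervals(iv) for c, iv in by_chrom.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}

    hits = 0
    for chrom, start, end in sites:
        if chrom not in merged:
            continue
        # Rightmost merged feature that begins before the site ends.
        i = bisect_right(starts[chrom], end - 1) - 1
        if i >= 0 and merged[chrom][i][1] > start:
            hits += 1
    return hits / len(sites) if sites else 0.0


# Hypothetical chromatin features and binding sites.
features = [("2L", 100, 200), ("2L", 150, 300), ("3R", 50, 80)]
sites = [
    ("2L", 290, 310),  # overlaps the merged 2L feature (100-300)
    ("2L", 300, 320),  # starts exactly where the feature ends: no overlap
    ("3R", 10, 50),    # ends exactly where the feature starts: no overlap
    ("3R", 60, 70),    # inside the 3R feature
    ("X", 1, 5),       # chromosome without any feature
]
overlap_fraction = fraction_overlapping(sites, features)
```

Merging the features first keeps each lookup to a single binary search per site, which matters when hundreds of thousands of features are compared against tens of thousands of binding sites.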

For each chromatin mark, very few target genes showed either 
repressive or activating marks across all of development; most genes 
were within dynamically marked regions (Supplementary Fig. 4a). We 
observed three major patterns of chromatin mark distributions, cor- 
responding to active promoters (H3K4me3, H3K9ac, H3K27ac), 
repressive states and silencers (H3K27me3, PHO/Polycomb response 
elements (PREs)), and enhancers (CBP, H3K4me1) (Supplementary 
Fig. 4b). 

The first pattern, represented by the activating histone 
modifications H3K4me3, H3K9ac and H3K27ac, was strongly associated with 
transcription start sites (TSSs) and was positively correlated with gene 
expression levels (Supplementary Figs 4b, 5a, 7). Although the 
enrichment of activating histone modifications was quite marked, 
we note that a substantial fraction of genes (34%) were expressed 
but lacked H3K4me3 marks at their annotated TSS (Supplementary 
Figs 8-12). Regions marked by each activating modification also 
significantly overlapped with class I insulators, PolII binding sites, and a 
large fraction of TFBSs (Supplementary Fig. 1). 

The second type of pattern, repressive chromatin marks H3K9me3 
and H3K27me3, showed a distribution of large domains throughout 
development (Supplementary Figs 2, 6, 13). As expected on the basis of 
polytene chromosomes in situ data, H3K9me3 marks localized to ~20 
developmentally stable domains primarily at centromeres’*. H3K27me3 
marks, in contrast, were remarkably dynamic (Fig. la). Dynamic 
domains may be due to changes in specific cell populations during 
development or the active addition and removal of H3K27me3 marks. 
Previous studies have implicated H3K27me3 dynamics in the regulation 
of homeotic genes, in the differentiation of stem cells, and in develop- 
mental processes in vertebrates'*. We found between 123 and 438 dis- 
crete domains present at the developmental stages assayed, each with an 
average length of ~70 kb (Supplementary Fig. 13a and Supplementary 
Table 2). One-thousand two-hundred and sixty-four genes were asso- 
ciated with H3K27me3 in at least one stage of development, with 397 
Figure 1 | Chromatin dynamics across Drosophila development.

a, Enrichment of CBP and H3K4me1 (rows) within regions marked by other
chromatin modifications, factors, or annotated enhancers (columns). Note that
(1) CBP is enriched within all active marks (H3K4me3, H3K27ac, H3K9ac,
H3K4me1 and PolII) at all stages of development, and (2) early embryo
(0-16 h) CBP- and H3K4me1-marked regions are enriched within H3K27me3
domains and annotated enhancers (right panel). b, Heatmap depicting fold 
enrichment of CBP-bound regions (columns) at different developmental stages 
for each of the 22 clusters of TSS-distal regions (rows) grouped by their protein 
binding profiles. A subset of the clusters shows significant enrichment for CBP 
at different developmental stages. c, Enrichment of enhancer categories 


genes (31%) in domains present in all stages of development and 867 
genes (69%) in dynamic domains (Fig. 1a). Stable H3K27me3 domains
corresponded to those reported in embryos and tissue culture cells, and
were enriched for genes involved in development, transcription and 
segmentation. However, identification of stage-specific H3K27me3 
domains revealed previously unappreciated H3K27me3 targets, including 
genes that control apoptosis, regulation of growth and neurotransmitter 
transport (Supplementary Fig. 14). We found that stable H3K27me3 
domains are highly enriched for genes that exhibit stage- and tissue- 
specific expression, and are depleted for ubiquitously expressed genes 
(Supplementary Fig. 15). 

H3K4me1-marked and CBP/p300-bound regions form a third,
intermediate class of genomic elements known to be associated with
active enhancers (Fig. 1a). They were also associated with active
promoters, activating histone marks and transcription factor binding
sites (Supplementary Fig. 1). H3K4me1 and CBP were bound broadly
across TSSs, typically positioned 1-2 kb upstream and downstream of
the TSS, consistent with previous observations (Supplementary Figs
2, 4b). Accordingly, these patterns were very dynamic across develop- 
ment (Supplementary Fig. 4a). 

To characterize the regulators of chromatin mark dynamics, we 
determined the genome-wide distribution pattern of all five known 
Drosophila HDACs (HDAC1 (also known as RPD3), HDAC3, 
HDAC4, HDAC6 and HDACX (also known as HDAC11)). All five 
HDACs are enriched at active promoters, and enrichment is correlated
with target gene expression level (Supplementary Fig. 16). HDAC4 and
HDAC1/RPD3 binding sites also mark PREs. HDAC1 and HDAC4
(columns) for each of the 22 clusters of TSS-distal regions (rows). Many clusters
enriched for CBP binding in early development are also strongly enriched for 
enhancers (rows marked with an asterisk). d, Embryo-specific CBP binding 
predicts unannotated enhancers. RNA in situ hybridizations with a Gal4 probe 
were used to stain transgenic embryos representing five different enhancer 
predictions (rows), at four to five different stages (columns). E0044 overlaps 
the known expression pattern for the neighbouring gene, CG8745 (FlyExpress 
Database). e, Enrichment of enhancer annotations (rows) within the binding 
sites of each transcription factor (columns). For panels a, b, c and e grey boxes 
indicate no overlap. For panels a, b, c and e all values greater or less than zero are
significant, false discovery rate (FDR) < 0.01. 


binding sites are frequently found within H3K27me3 repressive 
domains (Supplementary Figs 17, 18), and are significantly enriched 
at embryonic PHO (a PcG recruiter protein)-bound regions (Supplementary
Fig. 16f). Of the 537 HDAC1 and 4a binding sites that
overlap H3K27me3 but not H3K4me3 (Supplementary Fig. 16), 149
overlap with 350 previously described embryonic PHO sites (Supplementary
Figs 17, 18). HDAC3 is primarily associated with transcribed,
H3K36me3-marked exons (Supplementary Fig. 16a, d).

Using the dynamic chromatin signatures and RNA-seq data, we 
sought next to systematically annotate cis-regulatory elements. To 
identify novel promoters, we identified coincident H3K4me3, PolII
and RNA signals at least 1,000 base pairs away from any annotated 
TSS (see Supplementary Methods). In each developmental stage we 
found several hundred such regions (average, 485; range, 179-885), 
resulting in a total of 2,307 novel promoter predictions; 1,117 of which 
are supported by modENCODE cap analysis of gene expression 
(CAGE) data from embryos (Supplementary Fig. 5a). We subjected
110 novel promoter predictions to biological validation using a luciferase 
reporter assay in Kc167 cells. Seventy-five of these 110 predicted pro- 
moters (68%) yielded significant luciferase activity in at least one ori- 
entation, with 26 displaying bi-directionality (Supplementary Fig. 5b and 
Supplementary Table 3). Together, the CAGE and reporter assay data 
indicate that a high proportion of these novel promoter predictions 
indeed correspond to previously unannotated TSSs. 
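
The selection rule in this paragraph (coincident H3K4me3, PolII and RNA signal at least 1,000 bp from any annotated TSS) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline described in Supplementary Methods: regions are (start, end) pairs on a single chromosome, distance is measured from the midpoint of each H3K4me3 region, and all function names and toy coordinates are hypothetical.

```python
from bisect import bisect_left


def overlaps(region, intervals):
    """Return True if region (start, end) overlaps any interval in the list."""
    start, end = region
    return any(s < end and start < e for s, e in intervals)


def distance_to_nearest(pos, sorted_sites):
    """Distance from pos to the closest coordinate in a sorted list."""
    if not sorted_sites:
        return float("inf")
    i = bisect_left(sorted_sites, pos)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(pos - c) for c in candidates)


def predict_novel_promoters(h3k4me3, pol2, rna, annotated_tss, min_dist=1000):
    """Keep H3K4me3 regions that coincide with PolII and RNA signal and whose
    midpoint lies at least min_dist bp from every annotated TSS."""
    tss = sorted(annotated_tss)
    predictions = []
    for region in h3k4me3:
        midpoint = (region[0] + region[1]) // 2
        if (overlaps(region, pol2)
                and overlaps(region, rna)
                and distance_to_nearest(midpoint, tss) >= min_dist):
            predictions.append(region)
    return predictions


# Toy coordinates (hypothetical, for illustration only).
h3k4me3 = [(100, 600), (5000, 5300), (12000, 12400)]
pol2 = [(150, 500), (5100, 5250), (12100, 12300)]
rna = [(200, 700), (5200, 5600)]
annotated_tss = [400, 9000]
print(predict_novel_promoters(h3k4me3, pol2, rna, annotated_tss))  # prints [(5000, 5300)]
```

In the toy data the first region is rejected because it sits 50 bp from an annotated TSS, and the third because it has no RNA signal.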

To identify additional putative cis-regulatory elements on a genome-wide
scale, we examined two signatures of enhancers, H3K4me1 and
CBP/p300. CBP and H3K4me1 are significantly enriched within
several classes of known enhancers from the CRM activity database
(CAD). For example, we found a 15-fold (z-score of 26) and 5.9-fold
(z-score of 10) enrichment for CBP and H3K4me1 overlap, respectively,
with blastoderm-specific enhancers, indicating that our dynamic chro- 
matin map successfully recovers previously annotated enhancers 
(Supplementary Fig. 19). Given that CBP can be recruited to enhancers 
by bound transcription factors, we sought to further support the func- 
tional relevance of CBP-bound regions by examining clusters of co- 
occupancy with other transcription factors. Several CBP clusters are 
bound by transcription factors known to interact physically with CBP, 
such as Bicoid, Dorsal (DL) and Trithorax-like (TRL (also known as 
GAF)); whereas other clusters are enriched for known enhancers 
(Fig. 1e) and are strongly enriched in H3K4me1 and the repressive mark
H3K27me3 (Supplementary Fig. 20). In total, 14,450 distinct putative 
CBP-bound cis-regulatory elements were identified across the genome 
(Supplementary Table 18). 
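
The fold-enrichment and z-score figures quoted in this paragraph compare an observed overlap with a background expectation. The text here does not say how the background was generated, so the sketch below assumes a simple permutation null in which each region is moved to a random position on a single linear genome. The function names, the shuffling scheme and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def count_overlaps(regions, enhancers):
    """Number of regions that overlap at least one enhancer interval."""
    return sum(
        any(r_start < e_end and e_start < r_end for e_start, e_end in enhancers)
        for r_start, r_end in regions
    )


def enrichment(regions, enhancers, genome_length, n_shuffles=200, seed=0):
    """Return (observed overlap count, fold enrichment, z-score) relative to
    a null built by placing each region at a random position (assumed null)."""
    rng = random.Random(seed)
    observed = count_overlaps(regions, enhancers)
    null_counts = []
    for _ in range(n_shuffles):
        shuffled = []
        for r_start, r_end in regions:
            length = r_end - r_start
            new_start = rng.randrange(0, genome_length - length)
            shuffled.append((new_start, new_start + length))
        null_counts.append(count_overlaps(shuffled, enhancers))
    mu, sd = mean(null_counts), pstdev(null_counts)
    fold = observed / mu if mu > 0 else float("inf")
    z = (observed - mu) / sd if sd > 0 else float("inf")
    return observed, fold, z


# Toy example: ten 1-kb "enhancers" and ten bound regions placed inside them.
enhancers = [(i * 100_000, i * 100_000 + 1_000) for i in range(10)]
regions = [(start + 250, start + 750) for start, _ in enhancers]
observed, fold, z = enrichment(regions, enhancers, genome_length=1_000_000)
print(observed, round(fold, 1), round(z, 1))
```

A real analysis would shuffle within chromosomes and respect unmappable sequence; the sketch only shows how the two reported statistics relate to the same null distribution.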

To validate the ability of CBP binding data to accurately identify cis- 
regulatory elements, we tested 33 putative enhancer sequences using 
reporter assays in transgenic Drosophila. We focused on putative 
enhancers that have dynamic CBP association during embryogenesis 
(Supplementary Table 4). Thirty of the 33 predicted enhancers produce 
specific reporter expression patterns (Fig. 1d and Supplementary Fig. 21). 
We also selected a set of putative insulator elements, and we tested their
activity in an enhancer-blocking assay based on the eve stripe 2 and 3 
enhancers. We assayed a set of 15 genomic fragments associated with the 
binding of Centrosomal protein 190 kDa (CP190) + CTCF (class I), 
CP190 + suppressor of Hairy wing (Su(Hw)) (class II) and TRL. We
found that five of eight CTCF sites showed strong enhancer-blocking 
activity and the remaining three showed weak or variable activity. In 
contrast, neither of the TRL sites nor any of the five Su(Hw) sites we 
tested blocked enhancer-promoter interactions in this assay (Sup- 
plementary Fig. 22). These results support a role for CTCF in insulator 
activity in vivo, but indicate that other proteins that have classically been 
associated with insulator activity are not strictly linked to this function. 

To further annotate predicted enhancers and to determine whether 
dynamics of chromatin and gene expression from whole animals can be 
associated with specific factors, we analysed the patterns of 38 diverse
transcription factors we mapped at various developmental stages. We
compared our data with the CAD database (Fig. 1e) and observed that
many factors are specifically enriched in particular enhancer classes. For 
example, Engrailed (EN) binding sites are enriched within mesothoracic 
disc enhancers, whereas Knirps (KNI), Tailless and Schnurri (SHN) are
enriched within blastoderm enhancers. Indeed, enhancers are usually 
characterized by multiple transcription factors binding in concert to 
target genomic DNA. We used a Gaussian kernel density estimation 
across the binding profiles of 38 transcription factors mapped in early 
embryos in this and two previously published studies to define a
‘transcription factor complexity’ score based on the number and 
proximity of contributing transcription factors (see Supplementary 
Methods). Of 38,562 unique binding sites mapped by the 38 tran- 
scription factors, 38.3% are bound by more than two factors. Of the 
unique binding sites, 5.2% (1,962) are bound by more than eight 
factors (Supplementary Table 5 and Supplementary Figs 23, 24) and 
are considered high occupancy target (HOT) regions. Although HOT 
regions have been observed in Caenorhabditis elegans and human
(ENCODE project, unpublished results), their function in gene regulation
is unknown.
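
The idea of a complexity score that depends on both the number and the proximity of bound factors can be illustrated with a Gaussian-weighted count. The sketch below is a simplified stand-in for the kernel density estimate described in Supplementary Methods, not a reproduction of it: each factor contributes at most 1, weighted by a Gaussian of the distance to its nearest peak. The bandwidth of 300 bp, the function names and the data layout are assumptions; only the "more than eight factors" threshold for HOT regions comes from the text.

```python
import math


def complexity(position, peaks_by_factor, bandwidth=300.0):
    """Gaussian-weighted count of factors bound near a position.

    peaks_by_factor maps a factor name to a list of peak coordinates.
    A factor with a peak exactly at `position` contributes 1.0; the
    contribution decays with distance according to `bandwidth` (assumed).
    """
    score = 0.0
    for peaks in peaks_by_factor.values():
        if peaks:
            nearest = min(abs(position - p) for p in peaks)
            score += math.exp(-(nearest ** 2) / (2 * bandwidth ** 2))
    return score


def is_hot(position, peaks_by_factor, bandwidth=300.0, threshold=8.0):
    """Flag a position as a HOT region when its score exceeds the threshold."""
    return complexity(position, peaks_by_factor, bandwidth) > threshold


# Toy example: ten hypothetical factors all peaking at position 1,000.
peaks = {f"TF{i}": [1000] for i in range(10)}
print(complexity(1000, peaks), is_hot(1000, peaks))
print(is_hot(11000, peaks))
```

With this formulation a site bound by many factors whose peaks are scattered over several kilobases scores lower than one where the same factors pile up on a few hundred base pairs, which is the behaviour the score is meant to capture.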

Regions of higher complexity are weakly associated with more highly 
expressed genes (r² = 0.19), indicating that low-complexity binding sites
are associated with more restricted expression patterns. Interestingly,
annotated enhancers, CBP, activating histone marks including
H3K4me1, and HDAC1, 4, 6 and 11 are most significantly enriched
within low- to moderate-complexity category (CC) regions (CC2-CC8)
(Fig. 2b). These enrichments consistently decrease at regions of high
complexity (CC8-16). In contrast, we found that coding exons and
HDAC3, which marks actively transcribed exons (Fig. 2b and
Supplementary Fig. 24), are depleted from moderate-to-high-complexity
regions (>CC4). As expected, transcription factor complexity is inversely
correlated with nucleosome enrichment (Fig. 2b). Interestingly, when
compared to our enhancer validations and negative controls that were 
selected independent of HOT region determination, there seems to be 
no obvious relationship between enhancer activity and HOT regions; 
13 validated enhancers overlap with HOT regions but so did several 
sequences that give no enhancer activity (Fig. 1d, Supplementary Table 
4 and data not shown). Taken together, these results indicate that HOT 
regions are primarily associated with open chromatin but that they do 
not always demarcate cis-regulatory elements. 

The existence of HOT regions complicates the interpretation of transcription
factor co-occurrence. For example, pair-wise clustering of TFBSs
resulted in very large groups of co-occurring transcription factors, reveal- 
ing few specific relationships (Supplementary Fig. 23). However, TFBS 
clustering performed on HOT-subtracted TFBSs reveals structure that is 
otherwise obscured when HOT regions are included (Fig. 3). For 
example, binding sites from different stages assayed for the same tran- 
scription factor (for example, TRL, Ultrabithorax (UBX), Ecdysone 
receptor (ECR)), known interactors (for example, Tinman (TIN) with 
Twist (TWI) and Biniou (BIN) with Bagpipe (BAP)), and technical 
replicates (for example, GRO) are more tightly clustered in the HOT- 
subtracted data. Transcription factors known to physically interact with 
one another at specific enhancers were significantly associated through- 
out the genome. For example, the co-repressor complex of GRO and EN 
and the Drosophila SWI/SNF chromatin remodelling complex com- 
ponents Brahma (BRM) and Snf5-related 1 (SNR1) show significant 
co-binding (z > 20). Co-binding enrichment genome-wide was also 
observed for transcription factors that are known to bind independently 
to particular enhancers, such as UBX and EN that each bind to the DMX 
enhancer of the distalless (dll) gene, and each independently contribute to 
dll repression in different embryonic segments. DLL was itself enriched
for co-binding with EN, GRO and UBX, indicating common regulation 
of target genes. Interestingly, such previously undescribed interactions 
were seen at significance levels equal to or greater than those of known 
interactions. For example, while the previously reported mesodermal 
transcription factor data set (TIN, TWI, BIN, BAP, MEF2) all have
high overlap with one another as expected, these factors also all show 
highly significant overlap with GRO, CAD and EN. Many other notable 
overlap pairs were identified, including the Ecdysone receptor with TRL, 
Figure 2 | Transcription factor binding site complexity. a, Number of TFBSs
(left y-axis, black circles) and distribution of genomic annotation classes (right 
y-axis, colours) as a function of TFBS complexity (x-axis). b, Enrichment 
(colour scale: depleted in blue, enriched in red) of TFBSs sorted by complexity 
(x-axis) within annotated enhancers (CM, cardiac mesoderm; Ht, heart muscle; 
SM, somatic muscle; Meso., mesoderm; VM, visceral muscle), HDAC binding 
sites, early embryo chromatin marks. At the top is a heatmap depicting 
nucleosome density (Nuc. dens.) as a function of TFBS complexity. 
the peripheral nervous system master regulator Senseless (SENS) with
the axon guidance transcription factor Disconnected, and the Jak/Stat 
signalling pathway transcription factor STAT92E with the chromatin 
remodelling complex factors BRM and SNR1—all potential new con- 
nections between well-studied regulatory pathways or mechanisms. In 
total there are 831 highly significant positive pair-wise co-binding 
interactions in Supplementary Fig. 25 (z-score >20; bright red in 
Supplementary Fig. 25c), most of which are previously undescribed. 

Although most significantly associated transcription factor pairs did 
show positive overlaps, we observed a few instances of highly significant 
negative associations (shown in blue, Fig. 3). One of the most anti- 
correlated pairs of transcription factors is Brakeless (BKS) and CAD. 
BKS is a co-repressor that has been implicated in gap gene regulation, 
for example acting to restrict the expression of knirps (kni) and giant 
(gt) in the posterior blastoderm. In contrast, CAD activates kni and gt
in the same embryonic domain. Even when BKS and CAD have
multiple binding sites nearby one another, they appear to be nonover- 
lapping and in different putative cis-regulatory elements (Supplemen- 
tary Fig. 26). The biologically opposing roles of these two transcription 
factors seem to have led to the evolution of a very strong repulsion 
for occupying the same regulatory elements. To our knowledge, this 
genome-wide aversion in terms of transcription factor co-occupancy 
has not previously been observed in a metazoan genome. 

To visualize the regulatory interactions among transcription factors, 
we built an intuitive hierarchy representing transcription factor regu- 
latory associations (Supplementary Figs 25, 27; see Supplementary 
Methods). This network was constructed using 61 transcription factor 
data sets generated by the modENCODE project (pink nodes) and 20 
transcription factors from recently published work (green and
yellow nodes). Specifically, we built a core hierarchy using a breadth- 
first search algorithm in a bottom-up fashion. Transcription factors 


530 | NATURE | VOL 471 | 24 MARCH 2011 


Figure 3 | Transcription factor 
binding site overlap. Pairwise TFBS 
enrichment/depletion (colour-coded 
by z-score). TFBS data sets labelled at 
right. Asterisks indicate data from 
the BDTNP consortium; hashes
denote data from the Furlong
laboratory. Selected interactions
described in the text are highlighted. 
that regulate fewer than five transcription factors formed the bottom
layer whereas transcription factors that directly regulated the bottom 
layer factors form the second layer. In total, the network model 
characterized 835 interactions; 686 were established by transcription 
factors mapped in this study (blue edges), 125 were derived from 
previously published data (grey edges), and 24 were auto-regulatory
(Supplementary Fig. 25). Components of the network derived from 
modENCODE-mapped transcription factors capture many known 
regulatory interactions; for example, EVE regulates ftz and prd. 
However, the vast majority of the 686 transcription factor interactions 
represent new putative regulatory relationships. 
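
The bottom-up layering described above can be sketched as a breadth-first pass over the regulatory graph. This is a minimal sketch, assuming the network is given as a mapping from each transcription factor to the set of factors it regulates; the function name, the handling of self-loops and the toy network are assumptions, and the full procedure is in Supplementary Methods.

```python
def build_layers(regulates, bottom_max_targets=5):
    """Assign transcription factors to layers, bottom layer first.

    regulates maps each factor to the set of factors it regulates.
    Factors regulating fewer than `bottom_max_targets` other factors form
    the bottom layer; each higher layer holds the not-yet-assigned factors
    that directly regulate a member of the layer below.
    """
    factors = set(regulates) | {t for targets in regulates.values() for t in targets}
    # Auto-regulatory edges are ignored when counting targets (assumption).
    current = {
        f for f in factors
        if len(regulates.get(f, set()) - {f}) < bottom_max_targets
    }
    layers, assigned = [], set()
    while current:
        layers.append(sorted(current))
        assigned |= current
        current = {
            f for f in factors - assigned
            if regulates.get(f, set()) & current
        }
    return layers


# Toy network (hypothetical); a threshold of 2 keeps the example small,
# whereas the text uses a threshold of five.
toy = {
    "A": {"B", "C"},
    "B": {"D"},
    "C": {"D"},
    "D": set(),
    "E": {"A", "G"},
    "G": {"A", "E"},
}
print(build_layers(toy, bottom_max_targets=2))
# prints [['B', 'C', 'D'], ['A'], ['E', 'G']]
```

Factors that regulate nothing in the layer below are simply left unassigned by this sketch, so it should be read as an outline of the layering idea rather than a complete hierarchy builder.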

Transcription factors involved in widespread target co-binding and 
feed-forward loops are also likely to be involved in regulating common 
patterns of expression. To understand better how combinatorial tran- 
scription factor binding regulates developmentally dynamic gene 
expression, we analysed gene expression data from our RNA-seq time 
course and an independently performed 64-stage-developmental micro- 
array expression time course. We partitioned the expression data sets 
into 18 and 64 k-means clusters, respectively, which resulted in gene sets 
with widely varying temporal specificity (Supplementary Fig. 25b). For 
each cluster of genes, we then quantified the enrichment of promoter- 
proximal binding sites for 90 modENCODE and previously published 
transcription factor data sets. From the microarray time-course cluster- 
ing, five metaclusters were identified. Genes within these metaclusters 
are most highly expressed at third instar to adulthood (I), first instar to
pupal-adult ecdysone pulse (II), early embryos (III), embryogenesis and 
larval life (IV) and late embryos (V). In both the microarray and RNA- 
seq time courses, most clusters are significantly associated with a core set 
of transcription factors including SIN3A, UBX, CAD, SENS and TRL. 
Interestingly, all metaclusters are enriched for TRL binding sites except 
V, which is enriched for SNR1, another Trithorax group gene; consistent 
with reports that SNR1 has specialized functions. Metacluster II is most
highly expressed during adult central nervous system development and
enriched for several neuronal differentiation factors (Kruppel, KNI and
Jumeau). Metacluster III uniquely is associated with embryonic
patterning and organogenesis transcription factors (for example, 
Runt, Hunchback, TWI). Notably, many of the transcription factor 
co-enrichments within gene expression clusters correspond to bind- 
ing site and regulatory co-enrichments (Fig. 3 and Supplementary Fig. 
25), indicating that many of the co-associations of transcription 
factors with developmental expression patterns reflect co-binding 
and coordinate regulation at target sites in the genome. 
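
One common way to quantify whether a transcription factor's promoter-proximal binding sites are over-represented in an expression cluster is a one-sided hypergeometric test. The text does not name the statistic that was used, so the following is only an assumed illustration; the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment(cluster_size, bound_in_cluster, total_genes, total_bound):
    """Probability of seeing at least `bound_in_cluster` bound genes in a
    cluster of `cluster_size` genes drawn without replacement from
    `total_genes` genes, of which `total_bound` are bound by the factor."""
    upper = min(cluster_size, total_bound)
    tail = sum(
        comb(total_bound, k) * comb(total_genes - total_bound, cluster_size - k)
        for k in range(bound_in_cluster, upper + 1)
    )
    return tail / comb(total_genes, cluster_size)


# Toy example: 10 genes, 5 bound by the factor, a cluster of 5 genes that
# contains all 5 bound genes.
print(hypergeom_enrichment(cluster_size=5, bound_in_cluster=5,
                           total_genes=10, total_bound=5))
```

Applied across many clusters and many factors, the resulting P values would need correction for multiple testing before clusters are called enriched.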

In summary, we generated a draft regulatory annotation map of the 
Drosophila genome from 313 genome-wide data sets that identify or 
predict thousands of regulatory elements, including 537 silencers, 
2,307 newly annotated promoters, 14,450 candidate CBP-bound cis- 
regulatory elements, 7,685 putative insulators and 35,000 unique
TFBSs that were bound by one or more transcription factors (Supplementary
Tables 6-16 and Supplementary Fig. 28). The transcription
factor binding results defined HOT regions of increased transcription 
factor complexity and their association with HDACs and open chro- 
matin. Subsequent analysis of significantly co-bound transcription 
factors and transcription factor networks with HOT-subtracted data 
greatly expands the existing view of regulatory interactions among 
transcription factors and associates specific sets of transcription factors 
with specific developmental gene expression patterns. Several unex- 
pected results arose from this initial phase of the modENCODE 
Project. For example, we revealed a specific class of unmarked promo- 
ters, identified a surprising association of HDAC4 and HDACI1/RPD3 
to PREs, and discovered pairs of transcription factors that systematic- 
ally avoid binding near each other throughout the genome. We expect 
that the results from modENCODE will serve as launch points for 
many new investigations, and that additional novel insights about 
the functional consequences of the patterns we describe here will 
emerge as others in the community engage with these data. 


METHODS SUMMARY 


ChIP experiments were performed on whole Drosophila melanogaster animals 
from the following developmental stages: embryonic stages 0-4h, 4-8h, 12- 
16h, 16-20h, 20-24h, larval stages L1, L2 and L3, pupal stage and adult male. 
The biological material was homogenized in 1.8% formaldehyde. The cross-linked
chromatin was sonicated to an average size of 500 bp. Pre-cleared chromatin
extract was incubated overnight at 4 °C with the specific antibody and
immunoprecipitated. ChIP material was hybridized either on custom Agilent 
tiling microarrays or on Affymetrix Tiling arrays. For ChIP-seq, standard proto- 
cols for Illumina Genome Analysers were used. The software packages used for 
peak detection were MACS, Peakseq, HGGSEG, CisGenome, MAT and HMMseg 
where appropriate. For RNA-seq experiments, total RNA was extracted from the 
same material used for ChIP and processed according to Illumina standard pro- 
tocols. All methods and scripts used for the analysis of the data are described in 
Supplementary Information and are available on request. Transgenic assays for 
promoters, insulators and enhancers are described in Supplementary Information. 
TERRA and hnRNPA1 orchestrate an RPA-to-POT1
switch on telomeric single-stranded DNA 


Rachel Litman Flynn, Richard C. Centore, Roderick J. O’Sullivan, Rekha Rai, Alice Tse, Zhou Songyang, Sandy Chang,
Jan Karlseder & Lee Zou


Maintenance of telomeres requires both DNA replication and telo- 
mere ‘capping’ by shelterin. These two processes use two single- 
stranded DNA (ssDNA)-binding proteins, replication protein A 
(RPA) and protection of telomeres 1 (POT1). Although RPA and 
POT1 each have a critical role at telomeres, how they function in 
concert is not clear. POT1 ablation leads to activation of the ataxia 
telangiectasia and Rad3-related (ATR) checkpoint kinase at telomeres,
suggesting that POT1 antagonizes RPA binding to telomeric
ssDNA. Unexpectedly, we found that purified POT1 and its
functional partner TPP1 are unable to prevent RPA binding to 
telomeric ssDNA efficiently. In cell extracts, we identified a novel 
activity that specifically displaces RPA, but not POT1, from telomeric
ssDNA. Using purified protein, here we show that the heterogeneous
nuclear ribonucleoprotein A1 (hnRNPA1) recapitulates the
RPA displacing activity. The RPA displacing activity is inhibited by 
the telomeric repeat-containing RNA (TERRA) in early S phase, but 
is then unleashed in late S phase when TERRA levels decline at 
telomeres. Interestingly, TERRA also promotes POT1 binding to
telomeric ssDNA by removing hnRNPA1, suggesting that the re-accumulation
of TERRA after S phase helps to complete the RPA-to-POT1
switch on telomeric ssDNA. Together, our data suggest that
hnRNPA1, TERRA and POT1 act in concert to displace RPA from
telomeric ssDNA after DNA replication, and promote telomere cap- 
ping to preserve genomic integrity. 

RPA binds ssDNA in a non-sequence-specific manner, whereas
POT1 specifically recognizes ssDNA consisting of the telomeric repeats.
RPA plays a key role in DNA replication and activation of the ATR
checkpoint, and POT1 suppresses ATR activation at telomeres
(Supplementary Fig. 1). In both yeast and humans, RPA associates with
telomeres during S phase of the cell cycle, and is implicated in telomere
maintenance. Furthermore, ATR transiently associates with telomeres
and suppresses telomere instability. These findings raise the
question of how the bindings of POT1 and RPA to telomeric ssDNA are 
orchestrated and, furthermore, how the interplay between POT1 and 
RPA affects DNA replication and ATR activation at telomeres. 

Double-stranded DNA (dsDNA) with ssDNA overhangs activates 
ATR in human cell extracts. To investigate how ATR activation is
suppressed at telomeres, we tested whether telomeric ssDNA overhangs
affect ATR activation in this assay. Resected dsDNA of random
sequences, but not resected telomeric dsDNA, efficiently induced the
phosphorylation of RPA2 by ATR (Supplementary Fig. 2), suggesting
that telomeric ssDNA overhangs do not support efficient ATR activa- 
tion in cell extracts. 

The absence of ATR activation by telomeric ssDNA suggests that 
POT1 may prevent RPA binding to telomeric ssDNA. POT1 and
TPP1 function as heterodimers in cells, and the complex binds to
telomeric ssDNA more efficiently than POT1 alone. In gel-shift
assays, the POT1-TPP1 complexes purified from insect or human cells 


and the RPA purified from Escherichia coli efficiently bound to a 
telomeric ssDNA probe (Fig. 1a and Supplementary Fig. 3a, b).
POT1-TPP1 exhibited lower affinity for telomeric ssDNA than RPA
(Supplementary Fig. 3a). When POT1-TPP1 and RPA were co-incubated
with the probe, the RPA-ssDNA complex was readily detected,
whereas no POT1-containing complexes were observed (Fig. 1a and
Supplementary Fig. 3b). In pull-down assays using biotinylated telomeric
ssDNA (ssTEL), RPA also outcompeted POT1-TPP1 for binding
to ssTEL (Fig. 1b and Supplementary Fig. 3c). Thus, RPA, which is
more abundant than POT1-TPP1 in cells, outcompetes POT1-TPP1
for binding to telomeric ssDNA when present at similar concentrations
as POT1-TPP1. The E. coli ssDNA-binding protein only
modestly reduced POT1 binding to ssTEL (Supplementary Fig. 3c), 
suggesting that the ability to outcompete POT1-TPP1 is unique to 
RPA. 

The ability of RPA to outcompete POT1-TPP1 raises the question 
of how ATR activation is suppressed in cell extracts. Purified RPA 
bound to ssTEL and mutated telomeric repeats (ssMUT) efficiently 
(Fig. 1c). In stark contrast to purified RPA, the endogenous RPA in 
HeLa whole-cell extracts (WCEs) was largely excluded from ssTEL; 
however, it still associated with ssMUT (Fig. 1c). The sequence-specific 
exclusion of RPA from ssTEL in WCEs suggests that RPA may be 
outcompeted by other proteins or actively displaced from telomeric 
ssDNA. 

To assess if RPA is actively displaced from ssTEL, we pre-coated 
ssTEL and ssMUT with RPA then incubated them in extracts. The 
levels of RPA on ssTEL gradually declined with increasing concentra- 
tions of WCEs from HeLa, HEK293E, U2OS and MEF cells (Fig. 1d 
and Supplementary Fig. 4a). In addition, HeLa nuclear extracts, but 
not the cytoplasmic extracts, efficiently displaced RPA from ssTEL 
(Supplementary Fig. 4b). In marked contrast to the RPA on ssTEL, 
the RPA bound to ssMUT remained constant regardless of WCE 
concentrations (Fig. 1d). When POT1-coated ssTEL was incubated 
in extracts, POT1 remained stably bound to ssTEL even in high concentrations
of WCEs (Fig. 1e). Furthermore, RPA was rapidly displaced
from ssTEL within 5 min, whereas no POT1 was displaced
after 60 min (Supplementary Fig. 4c). Thus, the activity that displaces 
RPA from telomeric ssDNA is sequence-specific, protein-specific and 
localized within the nucleus. 

The specific displacement of RPA, but not POT1, from telomeric 
ssDNA prompted us to test if POT1 is the RPA displacing factor. 
When incubated with RPA-ssTEL, POT1-TPP1 did not significantly 
reduce the levels of ssTEL-bound RPA (Supplementary Fig. 5a). To 
identify the RPA displacing factors, we sought to capture the RPA 
displacing activity from extracts using RPA-ssTEL as bait. The proteins 
captured and eluted from RPA-ssTEL, but not RPA-ssMUT, recapitu- 
lated the RPA displacing activity (Fig. 2a). Mass spectrometry analysis of 
the proteins specifically captured by RPA-ssTEL identified hnRNPA1 
Figure 1 | A novel telomere-specific RPA displacing activity in human cell
extracts. a, POT1-TPP1 (60 nM; purified from insect cells), RPA (60 nM) and
mixtures of POT1-TPP1 and RPA (60, 120, 180 nM of POT1-TPP1 mixed
with 60 nM of RPA) were incubated with 20 nM of the ssDNA probe and
analysed by gel-shift. b, POT1-TPP1 (2.4 nM), RPA (2.4 nM) and mixtures of
POT1-TPP1 and RPA (2.4, 4.8, 7.2 nM of POT1-TPP1 mixed with 2.4 nM of
RPA) were incubated with 0.8 nM of biotinylated ssTEL ((TTAGGG)8). The


and hnRNPA2/B1 (Supplementary Fig. 5b, c), both of which are known 
to bind telomeric ssDNA. The presence of hnRNPA1 and A2/B1 in
the eluted fraction with RPA displacing activity was confirmed by
western blot (Fig. 2b). Moreover, hnRNPA1 and A2/B1 gradually
bound to ssTEL as RPA was displaced in WCEs (Fig. 2c). These results
suggest that hnRNPA1 and A2/B1 may play a role in RPA displacement.

hnRNPA1 has been implicated in telomere maintenance.
Extracts from hnRNPA1 knockdown cells exhibited reduced activity 
proteins bound to ssTEL were retrieved by streptavidin beads and analysed by
western blot. c, Biotinylated ssTEL or ssMUT ((TTTGCG)8) was incubated
with WCEs or recombinant RPA (rRPA). d, ssTEL or ssMUT pre-coated with
RPA was incubated with increasing concentrations of HeLa WCEs (0.08, 0.19,
0.36, 0.8, 1.3 µg µl⁻¹). The RPA2 remaining on ssTEL was analysed as in
b. e, ssTEL pre-coated with POT1 was incubated with increasing
concentrations of HeLa WCEs (0.07, 0.18, 0.33, 0.66, 1.3 µg µl⁻¹).


in RPA displacement (Supplementary Fig. 6a). Purified hnRNPA1 
efficiently displaces RPA from ssTEL, but not ssMUT (Fig. 2d). 
Furthermore, hnRNPA1 did not displace POT1 from ssTEL 
(Fig. 2e). hnRNPA1 only displaces RPA from ssTEL containing four 
or more telomeric repeats (Supplementary Fig. 6b), indicating that a 
DNA length-dependent binding mode of hnRNPA1 may be needed to 
displace RPA. Given that hnRNPA1 and A2/B1 are highly homologous
in the RRM domains that bind telomeric ssDNA, both of these
hnRNPs may contribute to RPA displacement. 

hnRNPA1 not only binds telomeric ssDNA but also TERRA. To
test if TERRA affects the ability of hnRNPA1 to displace RPA from
ssTEL, we added increasing concentrations of TERRA or control RNA 
to nuclear extracts, then incubated the extracts with RPA-ssTEL. RPA 
displacement was virtually abolished by TERRA, but not control RNA 
(Fig. 3a). The RPA displacing activity captured by RPA-ssTEL was also 
specifically inhibited by TERRA (Fig. 3b). Furthermore, the ability of 
purified hnRNPA1 to bind ssTEL and to displace RPA from ssTEL was 
specifically inhibited by TERRA (Fig. 3c and Supplementary Fig. 6c). 
Thus, TERRA is a potent inhibitor of the RPA displacing activity of 
hnRNPA1.


Figure 2 | RPA displacement by hnRNPA1. a, ssTEL and ssMUT pre-coated
with RPA were incubated with nuclear extracts (NE). After the incubation, the 
proteins bound to DNA were retrieved, eluted and applied to RPA-coated 
ssTEL or ssMUT (see Supplementary Methods). After the second incubation, 
the remaining RPA2 on DNA was analysed by western blot. b, Proteins 
captured by RPA-ssTEL or RPA-ssMUT and eluted by salt were analysed by 
western blot using antibodies to hnRNPA1, hnRNPA2/B1 and TRF2. c, RPA-coated
ssTEL (0.8 nM) was incubated with increasing concentrations of WCEs
(0.06, 0.24, 0.96 µg µl⁻¹). The hnRNPA1 and hnRNPA2/B1 bound to DNA and
the remaining RPA2 on DNA were analysed by western blot. d, RPA-coated
ssTEL or ssMUT (0.8 nM) was incubated with increasing concentrations of
purified hnRNPA1 (2.4, 4.8, 7.2 nM). The remaining RPA2 on DNA was
analysed as in a. e, POT1-coated ssTEL (0.8 nM) was incubated with increasing
concentrations of purified hnRNPA1 (2.4, 4.8, 7.2 nM).
Figure 3 | Regulation of RPA displacement by TERRA. a, Nuclear extracts
(34 ng µl⁻¹) were treated with increasing concentrations (2, 4, 10, 20 nM) of
TERRA (UUAGGG)3, control RNA (CCCUAA)3 or mock treated. The treated
nuclear extracts were then incubated with RPA-coated ssTEL (2 nM), and the
remaining RPA2 on ssTEL was analysed after the incubation. b, The RPA
displacing factors were captured with RPA-ssTEL as in Fig. 2a. The elution was
incubated with TERRA or control RNA, then applied to RPA-ssTEL.

c, Purified hnRNPA1 (4.8 nM) was incubated with increasing concentrations of


If hnRNPA1 displaces RPA from telomeric ssDNA, how can POT1
bind to telomeric ssDNA? Given that hnRNPA1 has affinity for both
telomeric ssDNA and TERRA, the presence of TERRA at telomeres
may promote the dissociation of hnRNPA1 from telomeric ssDNA.
Indeed, when hnRNPA1-coated ssTEL was incubated with TERRA,
hnRNPA1 was stripped from ssTEL (Fig. 3d), showing that hnRNPA1
binds telomeric ssDNA dynamically. Furthermore, when hnRNPA1-
ssTEL was incubated with TERRA and POT1-TPP1, POT1 efficiently
bound to ssTEL as hnRNPA1 was removed by TERRA (Fig. 3e).

The in vitro results above suggest that the initial displacement of 
RPA from telomeric ssDNA may be performed by hnRNPs when 
TERRA levels are low at telomeres (Fig. 3f). However, if TERRA levels 
rise at telomeres, hnRNPA1 may shuttle between telomeric ssDNA 
and TERRA dynamically. In this situation, both RPA and POT1 may 
have the chance to bind telomeric ssDNA. Because hnRNPA1 only 
displaces RPA, but not POT1, this dynamic process will eventually 
promote POT1 occupancy at telomeric ssDNA. 

This model raises the possibility that the RPA displacing activity 
may be regulated by TERRA during the cell cycle. To test this, we 
generated WCEs from cells in G1, early S, late S and M phases of 
the cell cycle. RPA was more efficiently displaced in the late S- and 
M-phase extracts than in the G1- and early S-phase extracts (Fig. 4a 
and Supplementary Fig. 7a, b). Thus, the RPA displacing activity is low 
in G1 and early S phase, but upregulated in late S phase. 

If TERRA inhibits the RPA displacing activity, its levels should 
inversely correlate with the activity. Furthermore, removal of 
TERRA in early S phase should alleviate the inhibition. Indeed, a 
recent study showed that TERRA levels significantly decrease in late 
S phase and increase again after S phase’. Consistently, telomeric 
TERRA foci declined as cells progressed from early to late S phase 
(Fig. 4b and Supplementary Fig. 7c, d). In addition, RNase A treatment 
of early S-phase extracts significantly enhanced the RPA displacing 
activity (Supplementary Fig. 7e). Together, these results suggest that 
TERRA inhibits RPA displacement in early S phase, and its decline in 
late S phase may provide a window for RPA displacement. 
TERRA or control RNA (2, 4, 10, 20 nM), then incubated with RPA-ssTEL (0.8
nM). d, hnRNPA1-coated ssTEL (0.8 nM) was incubated with increasing
concentrations of TERRA (2, 20, 200, 2,000 nM). The remaining hnRNPA1 on
ssTEL was analysed by western blot. e, hnRNPA1-coated ssTEL (0.8 nM) was
incubated with increasing concentrations of TERRA (2, 20, 200 nM) in the
presence of POT1-TPP1 (2.4 nM). The hnRNPA1 and POT1 on ssTEL were
analysed by western blot. f, A model for RPA displacement.


The model above also predicts that hnRNPs are necessary for RPA
displacement from telomeres. Depletion of hnRNPA1 using two inde-
pendent short interfering RNAs (siRNAs) significantly increased the
fraction of cells displaying RPA foci (Fig. 4c and Supplementary Fig.
8a-d). Notably, a fraction of the RPA foci in hnRNPA1 knockdown
cells closely associated with TRF2 foci. Furthermore, increased RPA
binding at telomeres was detected in hnRNPA1 knockdown cells by
chromatin immunoprecipitation (Fig. 4d). In synchronized hnRNPA1
knockdown cells, RPA binding to telomeres was enhanced in early S
phase (Supplementary Fig. 9a, b), indicating that even during this
period some hnRNPA1 remains free from TERRA and limits RPA
binding to telomeres. In late S/G2, RPA still declined at telomeres
in hnRNPA1 knockdown cells, possibly owing to the redundancy
among hnRNPs.

If the displacement of RPA by hnRNPA1 is a prerequisite for POT1
binding, POT1 should be needed for RPA exclusion after late S phase. 
To assess this possibility, we treated cells with POT1 siRNA and syn- 
chronized the cells with thymidine as POT1 levels declined (Sup- 
plementary Fig. 10a, b). After POT1 knockdown cells and control cells 
were synchronously released, RPA foci appeared in both cell popula- 
tions (Fig. 4e). As control cells entered G2, RPA foci rapidly declined. 
In contrast, the POT1 knockdown cells containing RPA foci that co- 
localized with TRF2 continued to increase. Concomitantly, modest 
Chk1 phosphorylation was detected in POT1 knockdown cells 
(Supplementary Fig. 10c). Thus, reduction of POT1 compromises 
the exclusion of RPA from telomeres after replication’’. 

During early to middle S phase, TERRA sequesters hnRNPs and 
allows RPA to bind telomeric ssDNA at replication forks or telomere 
ends (Supplementary Fig. 1). When TERRA levels decline in late S 
phase, hnRNPs are unleashed to displace RPA from telomeric ssDNA. 
The dynamic binding of hnRNPs to telomeric ssDNA is gradually 
antagonized by TERRA when TERRA reaccumulates at telomeres, 
providing a window for both RPA and POT1 to bind. Because only
POT1, but not RPA, binds to telomeric ssDNA irreversibly in the
presence of hnRNPs, this dynamic process favours the formation of 




[Figure 4 image. Recoverable labels: Early, Late; WCE; RPA-ssTEL; Cyclin A; Input; TRF2, Merge, DAPI; Mock, hnRNPA1-1, hnRNPA1-2; Cells with RPA foci (%); ChIP Ab: RPA2 Ab1, RPA2 Ab2; Telomere binding (percentage of relative input).]

Figure 4 | hnRNPA1 and POT1 suppress the accumulation of RPA at
telomeres. a, RPA-coated ssTEL was incubated with WCEs from cells in G1,
early S, late S and M phases of the cell cycle (see Supplementary Methods). The
remaining RPA2 on ssTEL was analysed after incubation. Cyclin A and
phospho-histone H3 serve as cell-cycle markers, and histone H4 as a loading
control. b, TERRA was analysed by RNA fluorescence in situ hybridization in
HeLa cells after thymidine release. TRF2 serves as a marker of telomeres.

c, HeLa cells were treated with hnRNPA1 siRNA or mock treated, then
immunostained with antibodies to RPA2 and TRF2 (left panel). The cells with


POT1-coated telomeric ssDNA. Unlike RPA, POT1 kinks telomeric
ssDNA and induces its self-recognition. These unique properties of
POT1 may confer resistance to hnRNP-mediated displacement. The
cell-cycle-regulated RPA displacement may allow RPA to transiently 
associate with telomeric ssDNA during replication, and prevent per- 
sistent ATR activation at telomeres after S phase. Once coated by 
POT1, telomeric ssDNA may remain capped until the arrival of rep- 
lication forks in the next S phase. Together, TERRA and hnRNPs 
orchestrate a cell-cycle-regulated RPA-to-POT1 switch on telomeric 
ssDNA, ensuring orderly telomere replication and capping. 


METHODS SUMMARY 


To analyse the binding of RPA and POT1-TPP1 to ssDNA, biotinylated ssDNA
was attached to streptavidin-coated magnetic beads. Biotinylated ssDNA (1 pmol)
was incubated with purified protein in 500 µl of binding buffer (10 mM Tris-HCl
(pH 7.5), 100 mM NaCl, 10 µg ml⁻¹ BSA, 10% glycerol, 0.05% NP-40).


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

ATR activation. To generate the 800-base-pair (bp) telomeric dsDNA fragment, 
the pSTY11 plasmid (a gift from T. de Lange) was digested with EcoRI and the 
excised fragment was gel purified. The 800-bp random sequence dsDNA was 
generated by PCR and column purified. These dsDNA fragments were incubated 
with T7 exonuclease for 15 s at room temperature (approximately 23 °C) and flash 
frozen in an ethanol-dry-ice bath. T7 was inactivated by subsequent incubation at 
70 °C for 20 min and DNA fragments were separated on 2% agarose gel to confirm 
equal resection. The resected DNA fragments were incubated with nuclear extract 
as previously described. To monitor specifically the phosphorylation of RPA2 by
ATR and eliminate the contributions of ataxia telangiectasia mutated (ATM) and
DNA-PK to RPA2 phosphorylation, nuclear extracts were pre-treated with 20 µM
KU55933 and NU7026 inhibitors for 15 min at 4 °C. The extracts were mixed with 
the DNA fragments, incubated for 15 min at 37 °C, and RPA phosphorylation was 
analysed by western blot. 

Protein purification. The POT1-TPP1 complex was either purified from
baculovirus-infected Sf9 cells as previously described, or purified from
HEK293E cells as follows. The pCL-Flag-POT1 and pCL-Flag-TPP1 vectors
were individually transfected or co-transfected into HEK293E cells. The cells were
collected after 72 h and lysed in the NETN buffer (100 mM NaCl, 1 mM EDTA,
20 mM Tris-HCl (pH 8.0), 0.5% NP-40 and protease inhibitors), sonicated and
cleared by centrifugation (10,000g for 10 min). The cleared lysates were incubated
with the M2 anti-Flag antibody-conjugated beads at 4 °C for 2 h and eluted with
200 µg ml⁻¹ 3× Flag peptide in buffer A (25 mM Tris-HCl (pH 8.0), 100 mM
NaCl, 10% glycerol) for 1 h. Recombinant RPA complex was purified from E. coli
as previously described. hnRNPA1 pET9d plasmid (a gift from A. Krainer) was
transformed into E. coli and expression was induced with IPTG (0.4 mM) for 3 h at
37 °C. The cells were then collected and lysed in binding buffer (10 mM Tris-HCl
(pH 7.5), 100 mM NaCl, 10 µg ml⁻¹ BSA, 10% glycerol, 0.05% NP-40). Lysates
were sonicated, cleared by centrifugation (10,000g for 10 min) and incubated with
ssTEL (50 µM)-conjugated M280 beads (100 µl) for 30 min at room temperature.
The ssTEL and associated protein were captured by magnets, washed in binding
buffer and eluted with 1 M NaCl for 10 min at 4 °C. The eluted protein was then
diluted in binding buffer without salt to bring the final NaCl concentration down
to 100 mM. E. coli single-stranded binding protein (SSB) was purchased from
Promega.
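
The dilution step that brings the 1 M NaCl eluate down to 100 mM follows the usual C1·V1 = C2·V2 relation. The sketch below works through that arithmetic. The 100 µl eluate volume is an assumption for illustration (the text does not state it), and the function name is hypothetical.

```python
def salt_free_buffer_volume(c_start_mM, c_final_mM, v_start_ul):
    """Volume (µl) of salt-free buffer to add so that NaCl drops
    from c_start_mM to c_final_mM, using C1*V1 = C2*V2."""
    if c_final_mM <= 0 or c_final_mM > c_start_mM:
        raise ValueError("final concentration must be > 0 and <= start")
    v_final_ul = c_start_mM * v_start_ul / c_final_mM
    return v_final_ul - v_start_ul


# 1 M NaCl eluate diluted to 100 mM; the 100 µl eluate volume is an assumption.
added = salt_free_buffer_volume(1000, 100, 100)
print(f"add {added:.0f} µl of salt-free binding buffer")
```

Under these assumptions the eluate is diluted tenfold, so nine volumes of salt-free buffer are added per volume of eluate.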

Gel-shift assay. The 18-nucleotide telomeric ssDNA probe [(TTAGGG)3] was
radiolabelled with γ-³²P using T4 kinase and purified over a G25 column. The
labelled ssDNA was incubated with purified RPA or POT1-TPP1 in binding
buffer (10 mM Tris-HCl (pH 7.5), 100 mM NaCl, 10 µg ml⁻¹ BSA, 10% glycerol,
0.05% NP-40) for 30 min at room temperature. The resulting protein-DNA com-
plexes were separated by gel electrophoresis using 0.8% agarose at 140 V for 1.5 h
and bands were visualized by autoradiography.

DNA-protein binding assay using biotinylated ssDNA. Biotinylated ssTEL
[(TTAGGG)8] or ssMUT [(TTTGCG)8] were attached to streptavidin-coated
magnetic beads in 10 mM Tris-HCl (pH 8.0), 100 mM NaCl at room temperature
for 30 min. To analyse the binding of purified RPA, POT1-TPP1 and POT1 to
ssDNA, biotinylated ssDNA (1 pmol) was incubated with various amounts of
purified protein in 500 µl of binding buffer. To analyse the binding of RPA and
Flag-POT1 to ssDNA in extracts, biotinylated ssDNA (10 pmol) and various
amounts of extracts were added to 500 µl of binding buffer. After incubation for
30 min, the protein-DNA complexes were retrieved with a magnet and washed
three times with binding buffer. In the experiments using RPA or POT1 pre-
coated ssDNA, biotinylated ssDNA (1 pmol) was first incubated with purified
protein (3.8 pmol) for 30 min at room temperature. The ssDNA pre-coated with
RPA or POT1 was retrieved with a magnet and subsequently mixed with increas-
ing concentrations of WCE, nuclear extract or cytoplasmic extract for 30 min at
room temperature. For nuclear extract inhibited by addition of TERRA or its
derivatives ((UUAGGG)3, (CCCAUU)3 and (UUGGCG)3), extracts were incu-
bated with 1, 2, 5 or 10 pmol RNA for 30 min at 4 °C.

For hnRNPA1 binding, RPA-coated ssTEL or ssMUT (0.8 nM), or Flag-POT1-
coated ssTEL (0.8 nM), were incubated with increasing concentrations of
hnRNPA1 purified from E. coli (2.4, 4.8, 7.2 nM) and the proteins remaining on
ssTEL were analysed by western blot. For TERRA inhibition, hnRNPA1 was pre-
incubated with increasing concentrations of TERRA (2, 4, 10, 20 nM), or control
RNA (UUGGCG)3. hnRNPA1 was then incubated with RPA-coated ssTEL
(0.8 nM). Similarly, to demonstrate that TERRA promotes the dissociation of
hnRNPA1 from ssTEL, the ssTEL (0.8 nM) was pre-coated with hnRNPA1
(2.4 nM) and subsequently incubated with increasing concentrations of TERRA
(2, 20, 200, 2,000 nM). To demonstrate that TERRA enhances POT1 binding,
ssTEL (0.8 nM) was pre-coated with hnRNPA1 (2.4 nM) then incubated with both
POT1 (2.4 nM) and increasing concentrations of TERRA (2, 20, 200 nM). In all
reactions, the proteins remaining on DNA were analysed by western blot.
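
To make the titrations above easier to compare, the sketch below expresses each concentration as a nominal fold molar excess over the 0.8 nM ssTEL. The concentrations come from the text. The helper name is hypothetical, and the ratios are per oligonucleotide only: they assume nothing about how many protein or RNA molecules one ssTEL molecule can bind.

```python
def molar_excess(species_nM, dna_nM):
    """Nominal fold molar excess of a species over DNA, rounded to two decimals."""
    return round(species_nM / dna_nM, 2)


SSTEL_NM = 0.8                   # pre-coated ssTEL, from the text
HNRNPA1_NM = [2.4, 4.8, 7.2]     # purified hnRNPA1 titration, from the text
TERRA_NM = [2, 20, 200, 2000]    # TERRA titration, from the text

for conc in HNRNPA1_NM:
    print(f"hnRNPA1 {conc} nM -> {molar_excess(conc, SSTEL_NM)}x over ssTEL")
for conc in TERRA_NM:
    print(f"TERRA {conc} nM -> {molar_excess(conc, SSTEL_NM)}x over ssTEL")
```

On these numbers, hnRNPA1 is titrated at 3-, 6- and 9-fold over ssTEL, whereas TERRA spans 2.5- to 2,500-fold.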
Cell synchronization. To follow the progression of cells from S to G2 (Fig. 4b, e),
HeLa cells were synchronized with 2 mM thymidine for 16 h, washed three times
with PBS and once with thymidine-free medium, and released into thymidine-free
medium. To enrich HeLa cells in S phase of the cell cycle (Fig. 4a), cells were either
collected after treatment for 16 h with 2 mM thymidine (early S), or collected 4 h
after thymidine release (late S). To enrich cells in G1 and M phases, cells were
either collected after treatment for 16 h with 0.1 µg ml⁻¹ nocodazole (M), or
collected 4 h after nocodazole release (G1).

Extract preparation. WCEs were either generated with the NETN buffer as 
described in the protein purification section, or with the binding buffer used in 
the DNA binding assays. Nuclear extract and cytoplasmic extract were generated 
as previously described. To treat extracts with TERRA or its derivative RNA, RNA
was added to WCE or nuclear extract in increasing concentrations (1, 2, 3,
10 pmol) and incubated for 30 min on ice.

Capture of RPA-displacing activity from extracts. To capture the RPA-displacing
activity from extracts, RPA-coated ssTEL was incubated with nuclear extract for
30 min at room temperature. The beads were collected, washed three times in
binding buffer, and eluted using the binding buffer with 1 M NaCl for 10 min on
ice. The eluted material was collected, and diluted with the binding buffer without
NaCl to reach a final NaCl concentration of 100 mM. The elution was incubated on
ice for 1 h then added to RPA-coated ssDNA and incubated for 30 min at room
temperature. For TERRA inhibition, either TERRA (UUAGGG)3 or its derivative
(CCCAUU)3 was incubated with the eluted proteins before their addition to RPA-
coated ssDNA. The proteins remaining bound to DNA were analysed by western
blot.

Identification of the RPA-displacing factors from extracts. Biotinylated ssTEL 
or ssMUT (20 pmol) was attached to streptavidin-coated beads and coated with 
recombinant RPA. The RPA-coated ssTEL or ssMUT was incubated with 65 µg of
nuclear extract in 500 µl of binding buffer. Beads with no DNA attached were used
as a negative control. After 30 min of incubation, the beads were retrieved and
washed three times with binding buffer containing 300 mM NaCl. The proteins 
associated with the RPA activity were eluted by binding buffer containing 600 mM 
NaCl for 10 min on ice. The eluted proteins from ssTEL, ssMUT and naked beads 
were separated by SDS-PAGE. After the gel was silver-stained, the two ~30-kDa 
bands specifically captured by RPA-ssTEL were excised and analysed by mass 
spectrometry. 

Immunofluorescence analysis. HeLa cells were seeded onto coverslips and cul-
tured overnight. The adhered cells were transfected with POT1 siRNA using
oligofectamine (Invitrogen), or with hnRNPA1 siRNA using Lipofectamine
RNAi Max (Invitrogen) and cultured for another 48 h. Synchronized cells were
treated after 24 h with 2 mM thymidine for 16 h, washed and released, and pro-
cessed at the indicated time points. Cells were extracted with 0.25% Triton, fixed in
3% paraformaldehyde and further permeabilized with 0.5% Triton. Cells were
subsequently incubated with the primary antibodies (diluted in PBS containing
3% BSA and 0.05% Tween 20) for 1 h at 37 °C in a humidified chamber. After
extensive washing with PBS, cells were incubated with secondary antibodies for
45 min at room temperature, and washed again with PBS. After incubation for
5 min with DAPI, cells were mounted on slides with Vectashield. Slides were
analysed using a Nikon H600L fluorescence microscope.

Combined immunofluorescence-RNA fluorescence in situ hybridization.
Cells were grown on coverslips and collected at different time points 17 h after
release of single thymidine block. Cells were washed twice with cold PBS for 5 min
and treated with cytobuffer (100 mM NaCl, 300 mM sucrose, 3 mM MgCl2,
10 mM PIPES pH 7, 0.1% Triton X-100, 200 mM vanadyl ribonucleoside complex)
for 7 min at 4 °C. Cells were rinsed briefly, fixed with 4% paraformaldehyde in PBS
(USB 19943) for 10 min at room temperature. Cells were then washed three times
with PBS for 5 min each and permeabilized with 0.5% NP40 in PBS for 10 min.
Cells were washed twice with PBS for 5 min each and incubated with blocking
solution (0.2% fish gelatin and 0.5% BSA) for 1 h. Cells were then incubated with
human TRF2 antibody (clone 44794 Upstate) at 1:2,000 and diluted in blocking
solution for 2 h. After washing three times with PBST (PBS containing 0.1%
Triton) for 10 min each, the cells were then incubated with secondary antibody
Alexa 488 (Invitrogen A11001) at 1:2,000 dilution in blocking solution for 1 h.
Cells were washed three times with PBST for 10 min each and were fixed with 4%
paraformaldehyde in PBS for 10 min at room temperature. Cells were rinsed
briefly with PBS then incubated with hybridization mix (10 nM PNA-TAMRA-
(CCCTAA) probe, 50% formamide, 2× SSC, 2 mg ml⁻¹ BSA, 10% dextran sul-
phate, 10 mM vanadyl ribonucleoside complex) for 18 h in a humidified chamber
at 39 °C. Cells were washed with 2× SSC in 50% formamide three times at 39 °C
for 5 min each, three times in 2× SSC at 39 °C for 5 min each, and finally once in 2×
SSC at room temperature for 10 min. Coverslips were then mounted on glass micro-
scope slides with Vectashield mounting medium containing DAPI (H-1200). For
RNase A treatment, coverslips were incubated with 200 µg ml⁻¹ RNase A for
30 min at 37 °C before hybridization. Images were captured with an Endore cooled
CCD (charge-coupled device) camera on a Nikon Eclipse 80i microscope and the
images processed with NIS-Element BR 3.10 software.
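
The hybridization step relies on the (CCCTAA) probe being the reverse complement of the TERRA repeat (UUAGGG). The sketch below checks that relationship at the sequence level. The function name is hypothetical, and the check treats the PNA probe as an ordinary base sequence, ignoring its backbone chemistry.

```python
# Complement table that accepts DNA or RNA input and emits DNA bases.
_TO_DNA_COMPLEMENT = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def reverse_complement_dna(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence (5'->3')."""
    return "".join(_TO_DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


TERRA_REPEAT = "UUAGGG"    # TERRA repeat unit, from the text
PROBE_REPEAT = "CCCTAA"    # probe repeat unit, from the text

terra = TERRA_REPEAT * 3   # three repeats, as in the (UUAGGG)3 oligonucleotide
print(reverse_complement_dna(TERRA_REPEAT) == PROBE_REPEAT)
print(reverse_complement_dna(terra))
```

Because the repeat is a tandem array, the reverse complement of any whole number of TERRA repeats is the same number of CCCTAA units.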

Chromatin immunoprecipitation. RPA chromatin immunoprecipitation and 
the analysis of telomere association were performed as previously described.
Cells were transfected twice with hnRNPA1 siRNA (hnRNPA1-1) and synchro-
nized with thymidine for 15 h. The two RPA2 antibodies used were from Abcam
and Thermo. 

Antibodies and siRNA. The RPA pS33 antibody was from Bethyl. The mono- 
clonal antibody to RPA2 was from Neomarkers. The anti-FLAG M2 antibody was 
from Sigma. The Chk1 antibody and Cyclin A antibody were from Santa Cruz, and 


the phospho-Chk1 Ser345 antibody was from Cell Signaling. The TRF2 antibody 
was from Bethyl. The phospho-H3 Ser10 antibody was from Millipore. The H4 
antibody was from Active Motif. The hnRNPA1 antibody was from Cell Signaling. 
The POT1 siRNA used in Fig. 4e and Supplementary Fig. 10 was the 
SMARTPOOL from Dharmacon. The hnRNPA1 siRNAs used in Fig. 4c, d and 
Supplementary Figs 8 and 9 were CAACUUCGGUC-GUGGAGGA and 
UCCACGACCACCACCAAAG. 


30. Henricksen, L. A. & Wold, M. S. Replication protein A mutants lacking 
phosphorylation sites for p34cdc2 kinase support DNA replication. J. Biol. Chem. 
269, 24203-24208 (1994). 
CANCER RESEARCH


Promise of protection 


Cancer vaccines have long shown lots of potential but few 
results. Signs of success now suggest opportunities. 


BY KELLY RAE CHI 


Last April, after years of false starts and
frustration, the first-ever therapeutic
cancer vaccine gained approval from the
US Food and Drug Administration (FDA).
Provenge (sipuleucel-T), made by Dendreon
in Seattle, Washington, has been demonstrated
to extend the survival of patients with late-stage
prostate cancer by several months, and its
success has shown the way for other cancer vac-
cines, says James Gulley, a medical oncologist
at the National Cancer Institute in Bethesda,
Maryland. “We knew there were regulatory
challenges that the FDA needed to look at to get
this approved,” he says. “So it set the milestones
for future paths for our therapeutic vaccines.”

Cancer vaccines have had a boost in the past
three years, thanks in part to Provenge’s suc-
cess as well as to a growing number of prom-
ising therapies in late-stage clinical trials,
says Gulley. Pharmaceutical companies have
approached him about developing therapeu-
tic-vaccine commercialization programmes,
and small biotechnology firms have shown
interest in combining cancer-drug projects
with vaccine development.

Hundreds of clinical trials of cancer vaccines 
are currently under way in the United States, 
and Hyam Levitsky, an oncologist at the Sid- 
ney Kimmel Comprehensive Cancer Center at 
Johns Hopkins University in Baltimore, Mary- 
land, sees more promise now than at any time 
during his 20 years in the field. People thinking 
about going into cancer-vaccine work can be 
assured that “there is going to be a future in
this”, he says.

Levitsky and his colleagues are optimistic 
that cancer vaccines will find their way into the 
standard of care not only because of the suc- 
cess of Provenge but also because of the wealth 
of trials going on. All this signals research 
opportunities in the near future. Some of the 
available jobs, from discovery to manufac- 
turing and testing, are the same as those in 
the field of conventional vaccines. But can- 
cer vaccines — immunotherapies that trig- 
ger or enhance defences against tumours or 
cancer-causing pathogens — are particularly 
challenging to develop, and most are still in 
the experimental stages. They work through 
a variety of mechanisms, making it difficult 
for scientists to develop standards to test and 
approve them. For the best chance of advanc- 
ing in the field, researchers must have a solid 
grounding in immunology and the ability 
to understand how cancers interact with the 
immune system. 


BENCH TO BEDSIDE 

Cancer vaccines fall into two major catego- 
ries: prophylactic vaccines, which are given to 
healthy people to prevent cancer developing, 
and therapeutic vaccines, which are intended 
to treat an existing cancer by strengthening the 
body’s natural defences. Therapeutic vaccines 
are usually made from cancer cells, or from 
viruses or molecules present in tumours of 
some or many types of cancer, whereas pre- 
ventive vaccines typically target viruses that 
cause cancer. Most vaccines of both types are 
developed using a one-size-fits-all approach, 
such as targeting molecules that are present in 
the cancers of many patients, but some, includ-
ing Provenge, are tailored to an individual patient.

Compared with infectious diseases, cancer 
poses some extra challenges for vaccine-mak- 
ers — enough to give pause even to optimistic 
scientists and clinicians. Investigators must 
ensure that their vaccines target tumour- 
specific cells without triggering a more gen- 
eral immune response against normal cells. 
Whereas traditional prophylactic vaccines
prevent infections in healthy people, ther-
apeutic cancer vaccines must be given to
patients whose immune systems have been
weakened by conventional cancer treatments
such as chemotherapy. Vaccines can take
time to work, so they offer small comfort
for those with only a few months to live.

And creating personalized therapies entails 
further scientific and commercial difficul- 
ties: it often requires harvesting the patient’s 
own cancer cells or developing companion 
diagnostics to identify subsets of patients who 
might benefit from the therapy. “Doing large 
trials and getting drugs approved and funded 
when they’re individualized is very, very dif-
ficult,” says John Sampson, a neuro-oncologist
at Duke University Medical Center in Durham, 
North Carolina. 

Anyone wishing to develop cancer vaccines
must understand tumour immunology,
because much of the basic science focuses on 
understanding the role of the immune system 
and inflammation in the development and 
progression of cancer. Only a few institutions 
offer doctoral programmes specifically in can- 
cer immunology; they include the University 
of Lausanne in Switzerland and the State Uni- 
versity of New York at Buffalo. The majority 
of cancer immunologists come from either 
immunology or cancer-biology programmes. 

Newcomers to the field should keep in mind 
that cancer-drug development is risky: it can 
take a decade or longer for a drug to get from 
the lab to the clinic. “Your chances of get- 
ting something that works are extraordinar- 
ily small,” says Sampson. Provenge needed 
20 years and three phase III clinical trials to 
reach FDA approval. 

Most big pharmaceutical companies are still 
watching from the sidelines, choosing not to 
fund or hire for early-stage development, says 
Karl-Josef Kallen, chief scientific officer of 
CureVac, a biotechnology firm based in Tübin-
gen, Germany. Vincent Tuohy, an immunolo-
gist at the Cleveland Clinic Lerner Research
Institute in Ohio, has struggled to raise funds 
for a phase I clinical trial of a prophylactic 
vaccine aimed at women with a high risk of 


developing breast cancer. “I'm having a rough 
time trying to convince the pharmaceutical 
industry, whose business model is based pre- 
dominantly on treatment and diagnostics, to 
enter the preventive space for these cancers,” 
says Tuohy, who has also been turned away by 
government funding agencies and non-profit 
organizations. 

One exception is GlaxoSmithKline (GSK), 
whose immunotherapeutics division is looking 
for clinical development managers for early- 
and late-stage cancer-vaccine development, 
medical affairs and project management at 
its biologics research hub in Rixensart, Bel- 
gium. The ideal candidates have experience in 
immunotherapeutics or general oncology, or 
specialize in specific types of cancer, says Roya 
Paganini, senior manager in charge of talent 
acquisition at GSK Biologicals in Rixensart. 

Smaller, discovery-focused biotechnology 
companies may have more opportunities. 
CureVac, for example, is looking for PhD- 
level research scientists to design vaccines 
and plan experiments to test them. They also 
need to be able to tweak the therapy’s dose 
and dosing schedule in combination with 
existing and new therapies, to demonstrate 
its efficacy. 


TRIAL RUNS 

Most training in designing and running 
clinical trials for cancer-drug development is 
simply not that helpful for vaccine develop- 
ment. Unlike traditional cancer therapies, for 
example, vaccines do not have a ‘maximum
tolerated dose’. In fact, the appropriate dos-
ing, scheduling of treatment and end points
for vaccines all differ vastly from those for 
traditional drugs. 

That makes immunology training, whether 
formal or on-the-job, essential for success. 
Principal investigators who are involved in 
patient care usually have medical doctorates, 
board certification in internal medicine or pae- 
diatrics, and experience in an oncology fellow- 
ship lasting several years. Immunology is not a 
distinct subspeciality of oncology, so research- 
ers should decide early on whether clinical 
research centred on cancer immunology is a 
path they wish to pursue. During an oncology 
fellowship, scientists with an interest in cancer 
vaccines should seek a mentor with experience 
as a faculty investigator, says Levitsky, adding 
that fellows might expect to design and help to 
analyse the results of one or two trials. By the 
time Gulley had finished his fellowship at the 
National Institutes of Health in Bethesda, he 
had nearly completed one clinical trial and had 
ideas for several more. Fellows training with 
Levitsky also sit in on first-year medical-school 
immunology courses and attend specialized 
seminars and journal clubs to stay up-to-date 
in the field. 

Trials need skilled clinical-research coor- 
dinators, who often work with or under the 
principal investigator to speed recruitment 
or reduce costs in clinical trials. “I think we
need to find people who are much better at
figuring out ways to do clinical trials well but
efficiently,” says Sampson, who has graduated
from a ‘master of health sciences in clinical
research’ programme at Duke. The university
also offers a non-degree option for clinicians, 
nurses and scientists. 


MAKING VACCINES 
In principle, making cancer vaccines is not 
that different from making monoclonal anti- 
bodies or other biologics, so one need not 
have a cancer-research background to get 
into manufacturing, quality control or quality 
assurance, says George Mitra, director of the 
Biopharmaceutical Development Program at 
the National Cancer Institute’s facility in Fred- 
erick, Maryland. However, individualized vac- 
cines can be more like a specialized research 
project, in which clinicians collect cells from 
a patient with cancer and use them to make a 
vaccine for that patient. Making such vaccines 
requires more skilled labour and experience 
than making generalized vaccines, says Mitra. 
Creating Provenge, for example, involves a 
process called leukapheresis, in which antigen- 
presenting cells are harvested from a patient's 
blood and purified. The cells are then sent to a 
Dendreon facility, where researchers cultivate 
them with a proprietary manufactured protein 
commonly found on prostate-cancer cells. The 
resulting vaccine is returned to the patient’s 
physician. It must be administered in three 
different treatments, each of which requires 
the same manufacturing process. 
Although the odds are long, if an individualized
vaccine becomes commercially feasible,
industrial manufacture would require PhD- and
master’s-level scientists with a background in
cellular processing, who are trained in blood
apheresis, bone-marrow transplantation
or graft engineering, as opposed to more
traditional drug manufacturing areas, says Levitsky.

If the field continues to grow, interested sci- 
entists will have options. But the field’s biggest 
draw may be the potential to help desperate 
patients. “To me, it’s immensely rewarding to 
see that we can make a difference in patients’ 
outcomes and survival, without causing sig- 
nificant side effects,” says Gulley. “That’s 
what’s going to keep me working in this field 
and make me try for bigger improvement in 
outcomes for these patients.” SEE OUTLOOK P.450


Hyam Levitsky 


Kelly Rae Chi is a freelance journalist based 
in Cary, North Carolina. 


TURNING POINT 


Richard Green 


Richard Green, a computational biologist
at the University of California, Santa Cruz,
was one of 118 young researchers to win a
US$50,000 two-year research fellowship 
from the Alfred P. Sloan Foundation, a 
philanthropic institution based in New York 
City. Green tells Nature how the fellowship 
will help to distinguish his work from that of 
his mentors. 


Why did you choose computational biology 
over bench work? 

I found that the cycle of determining whether 
an experimental idea is good or bad was much 
faster with a computer than a pipette — which 
is a big deal for someone like me, with many 
varied interests I hope to explore. 


What is the market like for computational 
biologists? 

There is a huge demand, but not that many 
people who marry a deep understanding of 
molecular biology with the ability to think in 
terms of algorithms. It’s really a seller’s market 
for people with these skills. As a result, I’m in
the enviable position of getting to pick and 
choose what I want to work on. It’s important 
to note that there is not some new discipline 
called ‘computational biology’; it’s really a 
third entity that combines biology and compu- 
tational ability. I suggest that young scientists 
become specialists in both disciplines. 


Has the prolific generation of genomics data 
altered career expectations? 

The bar has been raised: the sequencing 
and basic analysis of a genome is no longer 
an automatic paper in Science or Nature. 
But that’s a good thing. We have to be more 
creative; for example, species’ genomes can 
be compared to gain an evolutionary per- 
spective of the transition to multicellularity. 


How did you navigate your postdoc? 

If grad school is where a young scientist's ship 
is assembled, then the postdoc is the launching 
pad where you take off — or not. I had a great 
set of projects as a graduate student in Steven 
Brenner’s lab at the University of California, 
Berkeley. One of my last projects was inves- 
tigating how alternative gene splicing evolves 
in flies. As a postdoc, I wanted to ask the next 
obvious question: how quickly did alternative 
splicing evolve between chimps and humans? 
I was fortunate that my adviser introduced 
me to Svante Pääbo, director of evolutionary
genetics at the Max Planck Institute for Evo- 
lutionary Anthropology in Leipzig, Germany. 


Having your boss put you on these top 
researchers’ radar screens is incredibly helpful.


How did you get involved with the 
Neanderthal genome sequencing project? 
Blind luck. Shortly after I arrived, Svante was 
experimenting with a new shotgun sequenc- 
ing method before applying it to the precious 
few samples of Neanderthal DNA extracted 
from bone remnants. He wanted someone 
to align the sequences of ancient cave-bear 
DNA derived using this method to other 
genome sequences in the database — some- 
thing I could easily do, so I volunteered to 
impress my boss. Those alignments proved 
that the method worked. We then applied it 
to some of our best Neanderthal extracts, and 
Svante encouraged me to work on this. The 
Neanderthal project was such an incredible 
opportunity that I left the alternative splicing 
to the side. 


Did that cause problems with funders? 

A little. I dutifully told my funders about my 
change in focus — and I had to reapply under 
a different programme to avoid getting my 
funding pulled. It was unfortunate that the 
funders didn’t immediately see that this was 
a once-in-a-lifetime project and that I was 
perfectly positioned to do it. 


Is your work riskier than most scientists’? 
Maybe. One of the hallmarks of my research 
is that I do many things. This traditionally 
has been viewed as a weakness. If you have 
many areas of expertise, they can erode one 
another in people's perceptions. I want to use 
my expertise to move different fields forward; 
for example, sequencing the alligator genome 
will offer new insight into developmental as 
well as evolutionary biology.
LEGISLATION


Gender equality bid 


For the third time since 2008, US legislators 
have introduced a bill in the House of 
Representatives aimed at tackling gender 
inequalities in research. Eddie Bernice 
Johnson (Democrat, Texas), the sponsor of 
the bill, argues that the paucity of women 
in fields such as physics constitutes a 
competitive disadvantage for the United 
States. The bill would compel all US 
funding agencies to extend grants or fund 
temporary workers when researchers need 
time off to care for their families. The first 
try at the legislation died in committee in 
2008; the second, last year, hitched it to 
another bill from which it was stripped 
before passage. Progress depends on the 
House Committee on Science, Space,
and Technology and its chairman, Ralph
Hall (Republican, Texas), who will decide 
whether the bill will be considered. 


BIOMEDICAL FUNDING 


Disheartening cuts 


The effects of a US budget crunch could 
drive young scientists into other jobs, 
warns Francis Collins, director of the 
National Institutes of Health (NIH). At
a Washington DC panel on 15 March,
Collins noted that NIH grant success rates
have slid from 25-30% to 20% and below
in the past 4 years, owing to cuts. “Are these
people going to keep banging their heads
against the wall? Or are they going to find
some other way to make a living?” said
Collins at the talk, which was presented
by Research!America, an advocacy group
in Alexandria, Virginia. Few legislators
understand the importance of investing
in research, said panelist and former US
representative Mike Castle. “Agencies need
to get their story out,” he said.


NASA 


Agency woos women 


NASA has built a website to boost 
women’s and girls’ familiarity with the 
agency, raise their interest in working 
there and increase recognition of female 
contributions to aeronautics. The site, 
http://women.nasa.gov, was launched
on 16 March and gives glimpses into the
work lives and accomplishments of female 
astronauts, researchers and engineers 
through videos and essays. A careers page 
provides links to job openings, including 
graduate and postdoc fellowships. A 
NASA spokeswoman says that the agency 
is trying to change the perception that it is 
male-dominated. 
BY PETER ROBERTS


The waiting room was comfortably [...].
The whole place had clearly
been engineered to be non-threat- 
ening, even calming, while still looking 
professional: the walls were a pale, 
matte-textured powder blue, and the 
furniture and accessories were tastefully 
nondescript in various darker shades of 
blue. There was an air-conditioning vent 
on the wall, not quite directly above my 
head. The outflow gently touched 
my face, almost unnoticed, much 
as apprehension gently touched my 
consciousness. 

My anticipation prior to the pro- 
cedure was probably more intense 
than that of the typical patient. I had 
more to pay attention to. Most people 
focus on the anticipated benefits of the 
procedure, with their hopeful attitude 
perhaps slightly tempered by a mild 
anxiety about possible side effects or 
bad outcomes — a wholly unjusti- 
fied anxiety if you ask me, as no one 
has ever experienced any adverse bodily 
consequences, and virtually everyone has 
emerged healthier (often significantly so). 
Those who undergo treatment when they’re 
still young and healthy may not receive 
much benefit, but they are not harmed. I 
should know — I’m part of the team that 
developed the procedure. 

And because of my role in the develop- 
ment team, I was, unlike most of the others 
patiently awaiting their turn in the bio- 
culture chamber, familiar with the reports 
of bad psychological reactions to the pro- 
cedure. These traumatic incidents, the only 
reported negative outcomes, are relatively 
rare, but can be quite severe, even some- 
what debilitating, at least for a time. So, as 
the eldest member of the development team, 
and therefore the first to undergo the proce- 
dure, I was ready to observe and report on 
my emotional reaction. I thought I had pre- 
pared myself adequately. I was wrong. 

Of course, I knew that the procedure 
would be complicated: age-asymmetric bud- 
ding does not come as naturally to humans 
as it does to yeasts. And we had inverted the 
process as well, further complicating matters. 
Virtually every aspect of human physiology 
had to be modified, manipulated or sus- 
pended temporarily during the course of 
treatment. But the end result was unques- 
tionably worth all the complications. Yeast 
budding leaves a parent cell full of the miscoded
and broken genes, somatic impurities
and contamination, and telomere shortening
that ageing induces (or that produce ageing),
while generating as a result a youthful,
near-perfect offspring. We had altered the
process, so that the human bud would con- 
tain all the impurities, miscodings and so 
forth, leaving a purified and rejuvenated 
parent. This seemed to us to be a major step, 
perhaps even the final one, on the road to 
human immortality. Surely these results, 
these benefits, more than compensated for 
any psychological stress. 

And yet, shockingly, some of those who 
experienced the emotional side effects 
refused to repeat the treatment. They 
had become as adamantly opposed to the 
whole enterprise as those religious fanatics 
(dwindling in absolute numbers, but not, it 
seemed, in influence) who objected to what 
they saw as ‘playing God’.

I won’t bother you with a detailed description
of the preparation, or of the actual procedure,
except to say that the discomforts
were mild, and the indignities nothing out
of the ordinary for this
having something ‘other’ growing, rather
rapidly, out of my body — a sort of intimate 
violation, but without any sense of violence. 
My first thought was, “This must be 
what having a baby is like!” But of 
course I recognized that wasn’t quite 
accurate. It was more like having an 
identical twin, a conjoined twin — an 
incomplete, profoundly deformed, deathly 
ill twin, but a twin no less for all that. And 
then, shortly after the twin came into exist- 
ence, he had to be surgically removed 
and destroyed. After undergoing the 
experience, I was not surprised that 
some people found it deeply disturb- 
ing; rather, I was amazed that so few 
people were adversely affected. 
I did what I suppose most people do 
afterwards: I intellectualized the experi- 
ence. I went through the process step 
by step, reminding myself that the twin 
was genetically identical to me (aside 
from the miscodings and so on), and 
derived directly from my own body, 
like a tumour or a wart. My bud had 
no mental function to speak of, and 
hence didn’t suffer, or experience anything
whatsoever. Excising and disposing
of my twin was not very different from 
what would have been done with any other 
morbid lesion. 

But that’s not how it felt. 

My follow-up report did recommend one 
change to the aftercare protocol. I suggested 
that, if they so chose, patients should be per- 
mitted to stay with their bud, to nurse and 
nurture it until its natural demise. This, I 
hoped, would be less of a shock than having 
what had been an intimate part of oneself 
suddenly snatched away. At any rate, the 
bud’s survival time was likely to be quite 
brief. It might even be healthy for people 
to have the experience of taking care of — 
not exactly a child, but probably the closest 
thing to a child most of us will ever encoun- 
ter. Also, morbid though it may seem, this 
altered protocol would provide people with 
perhaps their only experience of serious ill- 
ness, death and mourning. 

I could muster a variety of perfectly 
rational arguments in favour of this proto- 
col change, but in the end, for me, the most 
powerful argument came down to this: it just 
seemed like the right thing to do.


Peter Roberts is a prolific poet: his poems 
and stories have been published in a wide 
variety of magazines and journals. 
BRIEF COMMUNICATIONS ARISING


Inclusive fitness theory and eusociality 


ARISING FROM M. A. Nowak, C. E. Tarnita & E. O. Wilson Nature 466, 1057-1062 (2010) 


Nowak et al.¹ argue that inclusive fitness theory has been of little value
in explaining the natural world, and that it has led to negligible pro- 
gress in explaining the evolution of eusociality. However, we believe 
that their arguments are based upon a misunderstanding of evolu- 
tionary theory and a misrepresentation of the empirical literature. We 
will focus our comments on three general issues. 

First, Nowak et al.¹ are incorrect to suggest a sharp distinction
between inclusive fitness theory and “standard natural selection 
theory”. Natural selection explains the appearance of design in the 
living world, and inclusive fitness theory explains what this design is 
for. Specifically, natural selection leads organisms to become adapted 
as if to maximize their inclusive fitness“. Inclusive fitness theory is 
based upon population genetics, and is used to make falsifiable pre- 
dictions about how natural selection shapes phenotypes, and so it is 
not surprising that it generates identical predictions to those obtained 
using other methods**”’. 

Second, Nowak et al.¹ are incorrect to state that inclusive fitness
requires a number of “stringent assumptions” such as pairwise interactions,
weak selection, linearity, additivity and special population
structures. Hamilton’s original formulations did not make all these 
assumptions, and generalizations have shown that none of them is 
required*°**. Inclusive fitness is as general as the genetical theory of 
natural selection itself. It simply partitions natural selection into its 
direct and indirect components. 

Nowak et al.¹ appear to have confused the completely general theory
of inclusive fitness with models of specific cases. Yes, researchers often 
make limiting assumptions for reasons of analytical tractability when 
considering specific scenarios*’, as with any modelling approach. For 
example, Nowak et al.¹ assume a specific form of genetic control, where
dispersal and helping are determined by the same single locus, that 
mating is monogamous, and so on. However, the inclusive fitness 
approach has facilitated, not hindered, empirical testing of evolutionary 
theory’’. Indeed, an advantage of inclusive fitness theory is that it 
readily generates testable predictions in situations where the precise 
genetic architecture of a phenotypic trait is unknown. 

Third, we dispute the claim of Nowak et al.¹ that inclusive fitness
theory “does not provide any additional biological insight”, delivering 
only “hypothetical explanations”, leading only to routine measure- 
ments and “correlative studies”, and that the theory has “evolved into 
an abstract enterprise largely on its own”, with a failure to consider 
multiple competing hypotheses. We cannot explain these claims, 
which seem to overlook the extensive empirical literature that has 
accumulated over the past 40 years in the fields of behavioural and 
evolutionary ecology’ (Table 1). Of course, studies must consider 
the direct consequences of behaviours, as well as consequences for 
relatives, but no one claims otherwise, and this does not change the 
fact that relatedness (and lots of other variables) has been shown to be 
important in all of the above areas. 

We do not have space to detail all the advances that have been made 
in the areas described in Table 1. However, a challenge to the claims of 
Nowak et al.¹ is demonstrated with a single example, that of sex
allocation (the ratio of investment into males versus females). We
choose sex allocation because: (1) Nowak et al.¹ argue that inclusive
fitness theory has provided only “hypothetical explanations” in this 
field; (2) it is an easily quantified social trait, which inclusive fitness 
theory predicts can be influenced by interactions between relatives; 
and (3) the study of sex allocation has been central to evolutionary 
work on the eusocial insects. In contrast to the claims of Nowak et al.¹,
recent reviews of sex allocation show that the theory explains why sex
allocation varies with female density, inbreeding rate, dispersal rate, 
brood size, order of oviposition, sib-mating, asymmetrical larval com- 
petition, mortality rate, the presence of helpers, resource availability 
and nest density in organisms such as protozoan parasites, nematodes, 
insects, spiders, mites, reptiles, birds, mammals and plants*’””. 

The quantitative success of this research is demonstrated by the 
percentage of the variance explained in the data. Inclusive fitness 
theory has explained up to 96% of the sex ratio variance in across- 
species studies and 66% in within-species studies'’. The average for all 
evolutionary and ecological studies is 5.4%. As well as explaining 
adaptive variation in behaviour, inclusive fitness theory has even 
elucidated when and why individuals make mistakes (maladaptation), 
in response to factors such as mechanistic constraints'’. It is not 
clear how Nowak et al.¹ can characterize such quantifiable success
as “meagre”. Their conclusions are based upon a discussion in the 
Supplementary Information of just three papers (by authors who 
disagree with the interpretations of Nowak et al.¹), out of an empirical
literature of thousands of research articles. This would seem to indi- 
cate a failure to engage seriously with the body of work that they 
recommend we abandon. 

The same points can be made with regard to the evolution of the 
eusocial insects, which Nowak et al.¹ suggest cannot be explained by
inclusive fitness theory. It was already known that haplodiploidy itself 
may have only a relatively minor bearing on the origin of eusociality, 
and so Nowak et al.¹ have added nothing new here. Inclusive fitness
theory has explained why eusociality has evolved only in monogam- 
ous lineages, and why it is correlated with certain ecological condi- 
tions, such as extended parental care and defence of a shared 
resource'*’’. Furthermore, inclusive fitness theory has made very 
successful predictions about behaviour in eusocial insects, explaining 
a wide range of phenomena (Table 2). 

Table 1 | Inclusive fitness theory has been important in understanding a
range of behavioural phenomena

Research area              Correlational?  Experimental?  Theory-data interplay
Sex allocation             Yes             Yes            Yes
Policing                   Yes             Yes            Yes
Conflict resolution        Yes             Yes            Yes
Cooperation                Yes             Yes            Yes
Altruism                   Yes             Yes            Yes
Spite                      Yes             Yes            Yes
Kin discrimination         Yes             Yes            Yes
Parasite virulence         Yes             Yes            Yes
Parent-offspring conflict  Yes             Yes            Yes
Sibling conflict           Yes             Yes            Yes
Selfish genetic elements   Yes             Yes            Yes
Cannibalism                Yes             Yes            Yes
Dispersal                  Yes             Yes            Yes
Alarm calls                Yes             Yes            Yes
Eusociality                Yes             Yes            Yes
Genomic imprinting         Yes             Yes            Yes

Data are taken from refs 9-11. Correlational studies test predictions using natural variation in key
variables, whereas experimental studies involve their experimental manipulation. Interplay between
theory and data means that theory has informed empirical study, and vice versa. Inclusive fitness is not
the only way to model evolution, but it has already proven to be an immensely productive and useful
approach for studying eusociality and other social behaviours.

Table 2 | Areas in which inclusive fitness theory has made successful predictions about behaviour in eusocial insects

Trait examined                   Explanatory variables                               Correlational studies?  Experimental studies?  Interplay between theory and data?
Altruistic helping               Haplodiploidy versus diploidy                       Yes                     No                     Yes
Worker egg laying                Worker policing                                     Yes                     Yes                    Yes
Policing                         Relatedness                                         Yes                     Yes                    Yes
Level of cooperation             Costs, benefits and relatedness                     Yes                     Yes                    Yes
Intensity of work                Need for work and probability of becoming queen     Yes                     Yes                    Yes
Sex allocation                   Relatedness asymmetries due to variation in queen   Yes                     Yes                    Yes
                                 survival, queen number and mating frequency
Sex allocation                   Resource availability                               Yes                     Yes                    Yes
Sex allocation                   Competition for mates between related males         Yes                     Yes                    Yes
Number of individuals trying to  Presence of old queens                              Yes                     Yes                    Yes
become reproductive
Workers killing queens           Presence of workers, reproductives or other queens  Yes                     No                     No
Exclusion of non-kin             Colony membership                                   Yes                     Yes                    Yes

Data are taken from refs 12-16.


Ultimately, any body of biological theory must be judged on its
ability to make novel predictions and explain biological phenomena;
we believe that Nowak et al.¹ do neither. The only prediction made by
their model (that offspring are favoured to help their monogamously
mated mother if this provides a sufficient benefit) merely confirms, in
a less general way, Hamilton’s original point: if the fitness benefits are
great enough, then altruism is favoured between relatives.
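
Hamilton’s point is usually written as the inequality rb > c, where r is the relatedness between actor and recipient, b is the fitness benefit to the recipient and c is the fitness cost to the actor. The following is a minimal Python sketch of that inequality only; the function names and the example values are illustrative assumptions and are not taken from any of the papers discussed here.

```python
import math


def altruism_favoured(r: float, b: float, c: float) -> bool:
    """Return True when Hamilton's rule, r * b > c, is satisfied.

    r: relatedness of actor to recipient
    b: fitness benefit to the recipient
    c: fitness cost to the actor
    """
    return r * b > c


def benefit_threshold(r: float, c: float) -> float:
    """Benefit that b must exceed for helping to be favoured at a given r and c.

    For r <= 0 no positive benefit satisfies the rule, so infinity is returned
    (an illustrative convention chosen for this sketch).
    """
    if r <= 0:
        return math.inf
    return c / r


if __name__ == "__main__":
    # Hypothetical values: full siblings (r = 0.5) and a cost of one offspring.
    print(altruism_favoured(r=0.5, b=3.0, c=1.0))   # benefit is large enough
    print(altruism_favoured(r=0.5, b=1.5, c=1.0))   # benefit is too small
    print(benefit_threshold(r=0.5, c=1.0))
```

With r fixed, the rule reduces to a threshold on b relative to c, which is why the letters above repeatedly refer to the ratio of b to c.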
Only full-sibling families evolved eusociality


ARISING FROM M. A. Nowak, C. E. Tarnita & E. O. Wilson Nature 466, 1057-1062 (2010) 


The paper by Nowak et al. has the evolution of eusociality as its title, 
but it is mostly about something else. It argues against inclusive fitness 
theory and offers an alternative modelling approach that is claimed to 
be more fundamental and general, but which, we believe, has no practical
biological meaning for the evolution of eusociality. Nowak et al.¹
overlook the robust empirical observation that eusociality has only 
arisen in clades where mothers are associated with their full-sibling 
offspring; that is, in families where the average relatedness of offspring 
to siblings is as high as to their own offspring, independent of popu- 
lation structure or ploidy. We believe that this omission makes the 
paper largely irrelevant for understanding the evolution of eusociality. 

Eusociality is not just any form of condition-dependent reproductive 
altruism as found in cooperative breeders, but the permanent division of 
reproductive labour. Clades where helpers became irreversibly eusocial 
(ants, some bees, some wasps, and termites²) are old, radiated into many
subclades over evolutionary time, and achieved considerable ecological
footprints. A recent comparative study³ showed that all hymenopteran
clades that fit the standard definition of eusociality⁴ evolved from lifetime
monogamous ancestors**. This implies that high relatedness
always preceded or coincided with eusociality, and contrasts with the 
contention of Nowak et al.¹ that eusociality can evolve in any group with
parental care, or that high relatedness arises after eusociality. 

Given that promiscuity is the most common mating system in 
animals, strict ancestral monogamy throughout eusocial clades 
implies that high relatedness was necessary for eusociality to evolve. 
Nonetheless, necessity does not imply sufficiency. Monogamous 
lineages may have remained solitary because the benefits of helping 
at the nest were insufficient to surpass independent breeding. This is 
elegantly captured by the ratio of the parameters b and c in Hamilton’s 
rule. In a number of ant, bee and wasp genera the high relatedness 
condition for eusociality has become secondarily relaxed via evolu- 
tionary elaborations such as multiple queen mating, but this has only 
occurred after worker phenotypes had specialized so that opting out to 
independent breeding had become selectively disadvantageous or 
developmentally impossible’. Claiming (in their Supplementary Information,
Part B) that it is far simpler to consider that advanced eusocial
species just need more sperm’ muddles proximate and ultimate explanations⁹,¹⁰;
many multiply-mating queens discard most of the sperm
they receive¹¹,¹², indicating that sperm limitation cannot explain
polyandry.

We now also know that departures from high relatedness would 
almost certainly have prevented the evolution of eusociality if they had 
happened before sterile castes had become permanent’, that is, before 
reaching the point of no return to breeding independently’. A recent 
comparative study on birds¹⁴ showed that cooperative breeding is an
unstable state that predominantly occurs in monogamous clades and is 
likely to be lost when parents become more promiscuous. This evidence 
is not merely correlative: differences in ancestral promiscuity between 
cooperative and non-cooperative species were found even before coop- 
eration arose, illustrating that monogamy preceded the evolution of 
helping and that helpers leave when relatedness incentives are reduced. 
This shows that high relatedness among siblings is critical along with the 
Hamiltonian b/c ratio but, as in the insects, relatedness is not sufficient 
because many monogamous birds are not cooperative breeders. 
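
The way relatedness and the b/c ratio act together can be illustrated with a small calculation. The sketch below assumes an outbred diploid species in which each of a mother’s n mates sires an equal share of her offspring, so that the average relatedness among her offspring is 1/4 + 1/(4n), and it compares raising siblings with raising one’s own offspring (relatedness 1/2). These simplifying assumptions and the function names are illustrative and are not taken from the studies cited.

```python
OFFSPRING_RELATEDNESS = 0.5  # outbred diploid parent to its own offspring


def sibling_relatedness(n_mates: int) -> float:
    """Average relatedness among a mother's offspring when she has n_mates
    mates that each sire an equal share (assumption of this sketch)."""
    if n_mates < 1:
        raise ValueError("n_mates must be at least 1")
    return 0.25 + 0.25 / n_mates


def break_even_ratio(n_mates: int) -> float:
    """b/c ratio that must be exceeded for helping to be favoured.

    Here b is the number of extra siblings raised by helping and c is the
    number of own offspring forgone. Helping is favoured when
    r_sibling * b > r_offspring * c, that is when b/c > r_offspring / r_sibling.
    """
    return OFFSPRING_RELATEDNESS / sibling_relatedness(n_mates)


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, sibling_relatedness(n), round(break_even_ratio(n), 3))
```

Under these assumptions, lifetime monogamy (n = 1) makes a sibling worth exactly as much as an offspring, so any benefit ratio above 1 favours helping, whereas multiple mating raises the required ratio towards 2. The condition is still not sufficient on its own, because the benefit of helping must actually exceed that threshold.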

In light of these reconstructions of the ancestral life histories of 
numerous social clades, it is surprising that the argument of Nowak 
et al.¹ about eusocial evolution starts by assuming that family structure
can be replaced by any form of population structure. This assumption is 
puzzling given the lack of empirical evidence that this hypothetical 
‘parasocial’ route to eusociality'* (where same-generation individuals 
associate independent of relatedness) has produced a single extant clade 
with obligately eusocial workers. We believe that this renders Part A of 
the Supplementary Information of Nowak et al.¹, and the arguments
throughout the first two-thirds of the paper, largely irrelevant to the 
origin of eusociality. Part C of the Supplementary Information 
addresses the evolution of sterile workers within monogamous or clonal 
families, meaning that relatedness in these models is invariant. As a 
consequence, we believe that these models have nothing to say about the 
importance of relatedness in the evolution of eusociality beyond show- 
ing that costs and benefits are also important. This was already clear 
from Hamilton’s rule nearly half a century ago. 

It should give pause for thought that none of the long-recognized 
approximations of inclusive fitness theory raised in the paper was 
important enough to preclude kin selection theory from developing 
into a well-integrated network of complementary hypotheses with 
high predictive power for reproductive decision-making in real-world 
social organisms. In contrast, the abstractions of Nowak et al.¹ fail to
provide any new predictions or questions; all they apparently have to 
offer is the truism that helpers are associated with longer-lived, fecund 
breeders. 


Jacobus J. Boomsma¹, Madeleine Beekman², Charlie K. Cornwallis³,
Ashleigh S. Griffin³, Luke Holman¹, William O. H. Hughes⁴,
Laurent Keller⁵, Benjamin P. Oldroyd² & Francis L. W. Ratnieks⁶
¹Centre for Social Evolution, Department of Biology, University of
Copenhagen, 2100 Copenhagen, Denmark.
e-mail: JJBoomsma@bio.ku.dk
²Behaviour and Genetics of Social Insects Lab, School of Biological
Sciences A12, University of Sydney, New South Wales, Australia.
³Department of Zoology, University of Oxford, South Parks Road, Oxford
OX1 3PS, UK.
⁴Institute of Integrative and Comparative Biology, Miall Building,
University of Leeds, Leeds LS2 9JT, UK.
⁵Department of Ecology and Evolution, Biophore, University of Lausanne,
1015 Lausanne, Switzerland.
⁶Laboratory of Apiculture and Social Insects, School of Life Sciences,
University of Sussex, Falmer, Brighton BN1 9QG, UK.


Received 19 September; accepted 17 December 2010. 


1. Nowak, M. A., Tarnita, C. E. & Wilson, E. O. The evolution of eusociality. Nature 466,
1057-1062 (2010).

2. Inward, D. J. G., Vogler, A. P. & Eggleton, P. A comprehensive phylogenetic analysis
of termites (Isoptera) illuminates key aspects of their evolutionary biology. Mol.
Phylogenet. Evol. 44, 953-967 (2007).

3. Hughes, W. O. H., Oldroyd, B. P., Beekman, M. & Ratnieks, F. L. W. Ancestral
monogamy shows kin selection is key to the evolution of eusociality. Science 320,
1213-1216 (2008).

4. Wilson, E. O. The Insect Societies (Belknap Press of Harvard Univ. Press, 1971).

5. Hamilton, W. D. The genetical evolution of social behaviour, I & II. J. Theor. Biol. 7,
1-52 (1964).

6. Alexander, R. D. The evolution of social behavior. Annu. Rev. Ecol. Syst. 5, 325-383
(1974).

7. Charnov, E. L. Evolution of eusocial behavior: offspring choice or parental
parasitism? J. Theor. Biol. 75, 451-465 (1978).

8. Boomsma, J. J. Kin selection versus sexual selection: Why the ends do not meet.
Curr. Biol. 17, R673-R683 (2007).

9. Mayr, E. Cause and effect in biology. Science 134, 1501-1506 (1961).

10. Tinbergen, N. On aims and methods of ethology. Z. Tierpsychol. 20, 410-433
(1963).

11. Baer, B. Sexual selection in Apis bees. Apidologie (Celle) 36, 187-200 (2005).

12. den Boer, S. P. A. et al. Prudent sperm use by leaf-cutter ant queens. Proc. R. Soc.
Lond. B 276, 3945-3953 (2009).

13. Wilson, E. O. One giant leap: How insects achieved altruism and colonial life.
Bioscience 58, 17-25 (2008).

14. Cornwallis, C. K., West, S. A., Davis, K. E. & Griffin, A. S. Promiscuity and the
evolutionary transition to complex societies. Nature 466, 969-972 (2010).




Author Contributions J.J.B. took the initiative for this contribution and wrote the first 
draft. All co-authors provided written and/or oral comments that helped shape the 
final submission. 

Competing financial interests: declared none. 


doi:10.1038/nature09832 


Kin selection and eusociality 


ARISING FROM M. A. Nowak, C. E. Tarnita & E. O. Wilson Nature 466, 1057-1062 (2010) 


Hamilton¹ described a selective process in which individuals affect kin
(kin selection), developed a novel modelling strategy for it (inclusive 
fitness), and derived a rule to describe it (Hamilton’s rule). Nowak 
et al.² assert that inclusive fitness is not the best modelling strategy,
and also that its production has been “meagre”. The former may be 
debated by theoreticians, but the latter is simply incorrect. There is 
abundant evidence to demonstrate that inclusive fitness, kin selection 
and Hamilton’s rule have been extraordinarily productive for under- 
standing the evolution of sociality. 


Below we list a few examples of what has been learned from applying 
kin selection theory—there are thousands of others. (1) Organisms 
overwhelmingly direct costly assistance, and all true altruism, towards 
kin’. (2) Eusociality in insects originated in organisms with parental 
care and single mating, which means that relatedness among helpers 
and brood is generally at the level of siblings*. (3) Benefits that can 
make helping more profitable than reproducing independently often 
take the forms of either fortress defence (termites, naked mole rats, 
social shrimp, social thrips and aphids, and some ants) or life insurance 
(wasps, bees, other ants)*°. (4) Sex ratios, worker egg laying, worker
policing, caste conflict and other social interactions in eusocial insects 
are explained by kin selection theory*’. 

Kin selection evidence is not merely correlative. Numerous kin 
selection experiments manipulate relatedness in animals, plants and 
microbes. Other experiments manipulate costs and benefits and show 
that kin selection is predictive.

Nowak et al. say that the haplodiploid hypothesis is not the only
explanation for eusociality, but that has not been in dispute for some
time. Haplodiploidy is not necessary for the evolution of eusociality,
and is not the same as kin selection.

Kin selection does not explain all social behaviour, but the claim 
that it does has never been widely accepted. There are cooperative acts 
that benefit the actor directly, and between-species mutualisms that 
must have direct benefits to evolve. But only kin selection can explain
true altruism.

Kin selection theory is still inspiring new research. Application of kin 
selection theory to microbes, including those causing human diseases, is 
expanding. Kin selection is changing our views of imprinting and
maternal-fetus diseases in humans.

Clearly kin selection is a strong, vibrant theory that is the basis for 
understanding how social behaviour has evolved. Perhaps the best 
examples come from kin recognition, a field that did not exist before 
Hamilton’s insights. We are puzzled why Nowak et al. would attack
a body of research that has been exemplary as “a domain of empirical
knowledge [that has] followed so closely and fruitfully upon an
abstract theoretical idea.”

Inclusive fitness in evolution


ARISING FROM M. A. Nowak, C. E. Tarnita & E. O. Wilson Nature 466, 1057-1062 (2010) 


For over fifty years, the evolution of social behaviour has been guided 
by the concept of inclusive fitness as a measure of evolutionary success. 
Nowak et al. argue that inclusive fitness should be abandoned. In so
doing, however, they misrepresent the role that inclusive fitness has 
played in the theory of social evolution by which understanding social 
behaviour in a variety of disciplines has developed and flourished. By 
discarding inclusive fitness on the basis of its limitations, they create a 
conceptual tension which, we argue, is unnecessary, and potentially 
dangerous for evolutionary biology. 

The core argument of Nowak et al. for abandoning inclusive fitness
is its limited capacity to predict dynamics in evolutionary models. This
is an old point, and one that was hotly debated in the early years of
kin selection theory. Inclusive fitness was developed by Hamilton to
summarize a difficult frequency-dependent selection problem by using
a simple maximization principle. Early work proved that the average
inclusive fitness effect is maximized by behavioural evolution in
family-structured populations and that it provides the surface for Wright’s
adaptive topography (arguably one of the most useful tools that has
ever been developed for understanding evolution).

Hamilton’s great insight was that individual fitness is not maximized
by social evolution when relatives are present; inclusive fitness is. The
idea that something other than the individual organism could be the
fitness-maximizing unit was completely revolutionary at the time and
opened new research areas that are still being developed, such as the
study of transitions in units of evolution and individuality.

Today, inclusive fitness and evolutionary dynamics models are
bridged and linked by the unifying concept of invasion fitness
(Fig. 1). Invasion fitness captures feedbacks between the evolution
of social traits and the ecological structure of the evolving
population (Fig. 2). Invasion fitness embraces average fitness and
inclusive fitness maximization at evolutionary steady states,
and further reveals the difficulty of reducing evolution to a simple
maximization process (Fig. 1a).

Under Hamilton’s rule, the condition for invasion of an altruistic
allele involves a linear function of genetic relatedness. With
eco-evolutionary feedbacks, Hamilton’s rule becomes part of a more
complex and dynamic framework in which relatedness between
interacting individuals is a dynamic property of the population and an
outcome of the models, rather than a pre-defined feature (which
relatedness was in earlier models of inclusive fitness). Yet in this
new framework the assumption of weak selection alone is often
sufficient for Hamilton’s rule to predict accurately endpoints of altruism
evolution (Fig. 1b). This framework has also begun to provide the
testable predictions under competing hypotheses about relatedness
and sociality that Nowak et al. call for (Fig. 2).

Both inclusive fitness and average fitness maximization are general 
insights about the evolutionary process with great heuristic value, 
even though they rely on special conditions to predict evolutionary 
dynamics. Considerable progress in extending evolutionary dynamics 
models to more general ecological, behavioural and genetic scenarios 
has been guided by the inclusive fitness concept. By opposing
‘standard selection theory’ and ‘inclusive fitness theory’, we believe
that Nowak et al. give the incorrect (and potentially dangerous)
impression that evolutionary thinking has branched out into conflict- 
ing and apparently incompatible directions. In fact, there is only one 
paradigm: natural selection driven by interactions, interactions of all 
kinds and at all levels. Inclusive fitness has been a powerful force in the 
development of this paradigm and is likely to have a continued role in 
the evolutionary theory of behaviour interactions. 

Figure 1 | Fitness landscape and Hamilton’s rule. a, Fitness landscape in
resident-mutant phenotypic space. Mutant invasion (inclusive) fitness is
maximized along the dashed curve and the black circle is an evolutionarily
stable phenotype. However, the asymmetrical sign structure implies that
evolutionary dynamics do not obey an optimizing principle. b, Selection
pressures arise from relatedness (red curve), physiological cost of altruism
(blue) and change in space occupancy (black). Under weak selection, the latter
is negligible (black curve close to zero) and the benefit is proportional to
relatedness (see for example, equation (3) in ref. 10), which makes Hamilton’s
rule a good approximation of selection. The black circle indicates the
evolutionarily stable phenotype found in a.

Figure 2 | Co-evolution of altruism and relatedness. Evolutionary
trajectories (thin curves) start with zero altruism and converge to evolutionarily 
stable phenotypes (black circles). Relatedness is dynamic and co-evolves with 
altruism. Eco-evolutionary models can predict how environmental factors (for 
example, habitat viscosity) affect altruism evolution. Here, viscosity decreases 
across trajectories from right to left, and the string of black circles shows how 
evolved altruism and relatedness should co-vary along an environmental 
gradient of habitat viscosity. 


METHODS 


See ref. 10 for details on the model underlying Figs 1 and 2 (and other references for
further development). Altruism is a continuous character (with haploid
inheritance) evolving in a spatial population network. Invasion fitness is the growth
rate of a self-structured mutant cluster. Figure 1a, b is based, respectively, on
figures 6E and 4E in ref. 10 with mobility rate = 1 and habitat viscosity = 1/4.
Figure 1a makes no special assumption on selection, interactions or population
structure. Figure 2 is based on figures 5F and 7F in ref. 10 with mobility rate = 1
and habitat viscosity from 1/4 (right) to 1/256 (left). Figures 1b and 2 only assume
weak selection.

In defence of inclusive fitness theory


ARISING FROM M. A. Nowak, C. E. Tarnita & E. O. Wilson Nature 466, 1057-1062 (2010) 


Arguably the defining characteristic of the scientific process is its
capacity for self-criticism and correction. Nowak et al. challenge
proposed connections between relatedness and the evolution of
eusociality, suggest instead that defensible nests and “spring-loaded”
traits are key, and present alternative modelling approaches. They
then dismiss the utility of Hamilton’s insight that relatedness has a
profound evolutionary effect, formalized in his widely accepted
inclusive fitness theory as Hamilton’s rule (“Rise and fall of inclusive
fitness theory”). However, we believe that Nowak et al. fail to make
their case for logical, theoretical and empirical reasons.

Logically, both in attacking inclusive fitness and in attempting to
reinforce their own positions, Nowak et al. cherry-pick examples
and fail to distinguish necessary from sufficient causes. Yes, there are
hundreds of haplodiploid species that are not eusocial. Yet, there are
also hundreds of nest-making (diploid) birds, mammals and reptiles
that are not eusocial. Moreover, if the non-eusocial, haplodiploid
species pose a problem for inclusive fitness, then the fact that hundreds
of them also make nests (including many living in communal or
subsocial groups) does not support the proposed alternative.

Theoretically, in promoting their modelling approach, Nowak et al.
pose a false dichotomy between inclusive fitness theory and “standard
natural selection theory”. They assert, we believe incorrectly, that
inclusive fitness theory suffers from numerous ills (for example,
“stringent assumptions”), yet their own models require stringent
assumptions, without the benefit of any generality. Indeed, although asserting
that “relatedness does not drive the evolution of eusociality”, the
authors do not present the critical test of removing the effects of
relatedness in their model (for example, by randomly assigning daughters to
nests). Thus, Nowak et al. do not provide any basis for their core
assertion, and available data on real biological systems directly
contradict it.

Empirically, Nowak et al., in our eyes, misinterpret relevant
literature. Emphasizing progressive provisioning of food to immatures
as a critical pre-adaptation (that is, a “spring-loaded” trait), they overlook
taxa (for example, sweat bees) in which eusociality evolved repeatedly
without progressive provisioning. It has been suggested that eusociality
might rapidly evolve, but the statement by Nowak et al. that studies of
forced sociality in Lasioglossum bees show that solitary bees will divide
labour “in foraging, tunnelling, and guarding” is incorrect. Lasioglossum
hemichalceum is social (communal), not solitary, and the solitary
Lasioglossum figueresi was studied in artificial arenas, not nests, so it
was impossible for bees to forage, tunnel or guard. Moreover, the small
carpenter bees that Nowak et al. cite are in a genus (Ceratina) that
contains no known obligately eusocial species, and only one species in
which facultative eusociality occurs at high frequency, indicating that
even if “spring-loaded” traits exist, Nowak et al. have misidentified
them.

What is clear is that neither haplodiploidy, nests, nor “spring-loaded”
traits is sufficient for the evolution of eusociality. However,
the most recent comparative evidence supports the basic prediction of
inclusive fitness theory that, regardless of ploidy or the presence of
nests or “spring-loaded” traits, high relatedness is key to the evolution
of cooperative breeding and/or eusociality. Any serious attempt to
dismiss inclusive fitness theory must address the results of these
important comparative studies directly.

Beyond its being completely integrated with “standard natural
selection theory”, beyond extensive theoretical work showing that
it is both flexible and robust, beyond the fact that available evidence
supports its fundamental prediction that high relatedness is key for
the evolution of eusociality, inclusive fitness theory has the virtue of
making general, non-obvious predictions well beyond the issue of
eusociality. Kin recognition and policing, mother-fetus conflicts,
and patterns of sex allocation (particularly in eusocial insects)
stand out. Collectively, those predictions have again and again
been borne out in a vast comparative and experimental empirical
literature (for example, refs 3-6, 14, 15) that Nowak et al. nonetheless
dismiss as “meagre” and “superficial”. Nowak et al. present a
provocative essay, but in their apparent rush to discard inclusive fitness
theory, they present an alternative that we believe to be deeply flawed.
Although the continued scrutiny of accepted paradigms is an essential
part of the scientific process, the reports of the fall of inclusive fitness
theory have been greatly exaggerated. If anything, Nowak et al.
succeed in reminding us of the elegance and power of Hamilton’s
numerous insights and contributions.

Our paper challenges the dominant role of inclusive fitness theory in
the study of social evolution. We show that inclusive fitness theory is
not a constructive theory that allows a useful mathematical analysis of
evolutionary processes. For studying the evolution of cooperation or
eusociality we must instead rely on evolutionary game theory or
population genetics. The authors of the five comments offer the
usual defence of inclusive fitness theory, but do not take into account
our new results.

The definition of inclusive fitness given by Hamilton is as follows:


“Inclusive fitness may be imagined as the personal fitness which an indi- 
vidual actually expresses in its production of adult offspring as it becomes 
after it has been first stripped and then augmented in a certain way. It is 
stripped of all components which can be considered as due to the indivi- 
dual’s social environment, leaving the fitness which he would express if 
not exposed to any of the harms or benefits of that environment. This 
quantity is then augmented by certain fractions of the quantities of harm 
and benefit which the individual himself causes to the fitnesses of his 
neighbours. The fractions in question are simply the coefficients of rela- 
tionship appropriate to the neighbours whom he affects: unity for clonal 
individuals, one-half for sibs, one-quarter for half-sibs, one-eighth for 
cousins,...and finally zero for all neighbours whose relationship can be 
considered negligibly small.” 


The concept of inclusive fitness assumes that the fitness of individuals 
can be split into additive components caused by individual actions. This 
approach rests on specific assumptions, which need not hold for any 
particular evolutionary process. Therefore inclusive fitness theory is 
not a general description of natural selection. In Part A of our
Supplementary Information we provide a mathematical analysis to prove
this point. If there are non-zero selection intensities, or if there are 
synergistic interactions, or if there is complex population structure, 
then it is easy to find situations where personal fitness cannot be par- 
titioned into additive components as needed by inclusive fitness theory. 
Essentially, inclusive fitness theory requires fitness to be a linear func- 
tion of individual actions, but a full understanding of social evolution 
must take into account the nonlinearity inherent in biological systems. 
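
As a rough illustration of the additive bookkeeping in Hamilton’s definition quoted above, the sketch below computes a baseline (“stripped”) fitness plus relatedness-weighted effects on neighbours. It is not taken from the paper or its Supplementary Information; the function name `inclusive_fitness` and all numerical values are hypothetical, and the code simply assumes the additivity that this reply argues need not hold.

```python
from typing import Iterable, Tuple


def inclusive_fitness(stripped_fitness: float,
                      effects_on_neighbours: Iterable[Tuple[float, float]]) -> float:
    """Additive inclusive fitness following Hamilton's verbal definition.

    stripped_fitness: the fitness the actor would express without any harms
        or benefits from its social environment (hypothetical input).
    effects_on_neighbours: (relatedness, effect) pairs, where effect is the
        harm (< 0) or benefit (> 0) the actor causes to one neighbour.
    """
    # Each effect is weighted by the coefficient of relationship and summed.
    return stripped_fitness + sum(r * effect for r, effect in effects_on_neighbours)


# Hypothetical example: the actor helps two sibs (r = 1/2) and one cousin (r = 1/8).
example = inclusive_fitness(0.875, [(0.5, 0.5), (0.5, 0.5), (0.125, 0.5)])
print(example)
```

Because the sum is linear in each effect, synergistic (nonlinear) interactions cannot be expressed in this form, which is the limitation described in the paragraph above.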

We distinguish between inclusive fitness theory and standard natural 
selection theory, because the latter does not require fitness to be split 
into additive components. We have shown that inclusive fitness theory 
is a proper subset of the standard theory and makes no independent 
predictions. Any effect of relatedness is fully captured by the standard 
approach. 

Hamilton’s rule states that cooperation can evolve if relatedness
exceeds the cost to benefit ratio. If cost and benefit are parameters of
individual actions then this rule almost never holds. There are
attempts to make Hamilton’s rule work by choosing generalized cost
and benefit parameters, but these parameters are no longer properties
of individual phenotypes. They depend on the entire system including
population structure. These extended versions of Hamilton’s rule have
no explanatory power for theory or experiment.
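
For concreteness, the simple form of Hamilton’s rule stated above (relatedness exceeding the cost to benefit ratio) can be written as a one-line check. This is a minimal sketch with fixed, hypothetical cost and benefit parameters, which is the form this reply argues almost never holds; the function name `hamilton_rule_holds` and the example values are assumptions for illustration only.

```python
def hamilton_rule_holds(relatedness: float, benefit: float, cost: float) -> bool:
    """Return True when relatedness exceeds the cost to benefit ratio (r > c / b).

    Assumes a positive benefit so that the ratio is well defined.
    """
    if benefit <= 0:
        raise ValueError("benefit must be positive")
    return relatedness > cost / benefit


# Hypothetical values: benefit 3, cost 1, so the threshold ratio is 1/3.
print(hamilton_rule_holds(0.5, 3.0, 1.0))    # relatedness of full sibs
print(hamilton_rule_holds(0.125, 3.0, 1.0))  # relatedness of cousins
```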

Neither inclusive fitness theory nor any formulation of Hamilton’s 
rule can deal with evolutionary dynamics. This fact alone invalidates
the claim that inclusive fitness theory “is as general as the genetical 
theory of natural selection”. 

Several aspects of our paper are misrepresented in the comments.
One, we do not argue that relatedness is unimportant. Relatedness is
an aspect of population structure, which affects evolution. Two,
we do not dispute the importance of kin recognition. Conditional
behaviour based on kin recognition can be seen as a mechanism for
the evolution of cooperation. Three, Part A of our Supplementary
Information is not a model for evolution of eusociality, but a
mathematical framework that demonstrates the limitations of
inclusive fitness theory. Four, Part C of our Supplementary
Information provides a mathematical model for the evolution of
eusociality, which makes simple and testable predictions and explains the
rarity of the phenomenon. Five, monogamy and sex ratio manipulation
may be important for the evolution of eusociality; such ideas are
best tested in the context of the explicit model that we propose.

Abbot et al. claim that inclusive fitness theory has been tested in a
large number of biological contexts, but in our opinion this is not the
case. We do not know of a single study where an exact inclusive fitness
calculation was performed for an animal population and where the
results of this calculation were empirically evaluated. Fitting data to
generalized versions of Hamilton’s rule is not a test of inclusive fitness
theory, which is not even needed to derive such rules.

The limitations of inclusive fitness theory are also demonstrated by 
its inability to provide useful calculations for microbial evolution.

Herre and Wcislo have presented a one-sided account of cases in
halictid eusociality, the details of which do not detract in the least
from our argument. Halictid bees were not ignored as stated; we cited
them three times. Furthermore, communal halictid bees are ‘social’
only in a primitive sense. They occupy a commons-like tunnel but
build and defend their own personal cells as solitary bees. Herre and
Wcislo point out that the experiments of Wcislo were designed not
to allow foraging, tunnelling, or guarding, but do not mention that
these behaviours were tested in other experiments. Bees are mass
provisioners, as Herre and Wcislo say, and we should have used the
phrase ‘defence and care of young with mass provisioning (bees) or
progressive provisioning (others)’. We thank Herre and Wcislo for
pointing out this oversight. Primitively eusocial halictids nevertheless
devote considerable care to the cells, guarding them and in many cases
opening them to clean out waste.

Various authors mention sex ratio theory, which we do not study in 
our paper. Nevertheless a precise understanding of sex ratio evolution 
is based on population genetics and does not require inclusive fitness 
theory. 

BRIEF COMMUNICATIONS ARISING


There is no support for the claim that evolution maximizes inclusive 
fitness. Nobody has offered a mathematical statement explaining what 
should be maximized and for which process. 

Hamilton’s work has stimulated much empirical research and has 
led to many measurements of relatedness. But we have shown that we 
cannot rely on inclusive fitness theory to describe how interactions 
among related individuals affect evolution. Inclusive fitness theory is 
neither useful nor necessary to explain the evolution of eusociality or 
other phenomena. It is time for the field of social evolution to move 
beyond the limitations of inclusive fitness theory. 


Martin A. Nowak, Corina E. Tarnita & Edward O. Wilson

Program for Evolutionary Dynamics, Department of Mathematics,
Department of Organismic and Evolutionary Biology, Harvard University,
Cambridge, Massachusetts 02138, USA.

e-mail: martin_nowak@harvard.edu

Museum of Comparative Zoology, Harvard University, Cambridge,
Massachusetts 02138, USA.

Ask many people what they are afraid of, and cancer, the big C, will
often top the list. Some forms of cancer have become easily treatable. But
in many cases, by the time doctors deploy the weapons of 
surgery, chemotherapy or radiation, the cancer has already 
progressed past the point where medical intervention can 
cure the condition. 

Many cancer specialists now contend that the best way to 
deal with cancer is to ensure it doesn’t develop in the first place. 
Just as the most effective way to deal with frostbite is not to 
amputate a toe but to wear warm socks and shoes beforehand, 
the same is true for cancer — the best solution is to enhance 
our defences and, as best we can, avoid carcinogens. 

A small subset of the cancer research community is 
focusing its attention on this goal. Chemopreventive drugs 
and vaccines could fortify our healthy bodies against 
future malignancies (see articles on page S5 and S8). New 
technologies are revealing precancerous lesions when they 
can be easily snuffed out (S14). Certain foods with anticancer 
properties can form part of a healthy diet (S22), while their 
prophylactic ingredients can be extracted or synthesized. 

While curing a patient is a heart-warming triumph,
prevention merely maintains the status quo of good health. 
No surprise which activity gets the lion’s share of media 
attention — and funding. This Nature Outlook showcases the 
progress being made in cancer prevention despite its status as 
the poor relation of the cancer research world. 

We can only hope that the efforts reported here will 
make cancer as obsolete as the pock marks that not so 
long ago marred the faces of those who survived smallpox. 
We are pleased to acknowledge the financial support of 
the Janssen Pharmaceutical Companies in producing this 
Outlook. As always, Nature retains sole responsibility for 
all editorial content. 


Herb Brody 
Supplements Editor, Nature Outlook. 

INTRODUCTION


The prevention agenda 


Despite our relative wealth of knowledge about the causes of cancer, the disease persists,
and the burden is worsening. Prevention demands political will, ample funding and a
change in mindset.


BY TIFFANY O’CALLAGHAN 

Half a century ago, more than 75% of
British men smoked; today, that figure
is closer to 20%. This drop has cut
lung cancer deaths in middle-aged men in the
United Kingdom by as much as half. A similar
trend, albeit less steep, is evident in other
countries where smoking has declined, including
the United States.

And that’s not the only significant advance 
in preventing cancer. Screening has also made 
an impact. The pap smear, to detect precan- 
cerous cells in the cervix, has helped cut US 
cervical cancer mortality rates from 5.5 per 
100,000 in 1975 to just 2.4 in 2007. Antiviral 
vaccines are another success story: introduc- 
tion of the hepatitis B virus (HBV) vaccine in 
1982 cut chronic HBV infection rates among 
children in some countries from 15% to less 
than 1%, which has translated into reduced 
rates of liver cancer in adults. Hopes are high 
that vaccines against the human papillo- 
mavirus (HPV) will make similar inroads. 

In spite of these encouraging trends, cancer,
in all its guises, continues to undermine
global health. In 2008, there were 12.7 million
new cancer cases and 7.6 million deaths accord- 
ing to the American Cancer Society (ACS), cost- 
ing the global economy nearly US$900 billion. 
By 2030, the World Health Organization 
(WHO) predicts we'll face more than 21 mil- 
lion new cancer cases and 13 million deaths 
each year at skyrocketing costs to society. 

The vast bulk of cancer research is trying 
to find treatments for people who are already 
sick. But approximately a third of cancers are 
caused by tobacco and at least a quarter are 
attributed to other lifestyle factors. The focus 
on cures only perpetuates the Sisyphean task 
of keeping cancer at bay. With all that we know 
about preventable causes of cancer, why is the 
incidence of cancer increasing? The answers 
are as complex and intertwined as the causes 
themselves. 


WILLPOWER REQUIRED 

The sources of cancer are manifold. It’s clear 
that there are environmental effects: second-generation
immigrants exhibit the disease patterns
of their compatriots, not of their ancestors.
But there’s more to it than exposure.
Although up to 90% of lung cancer is caused
by smoking, fewer than a sixth of smokers
develop the disease. Some cancers are caused 
by faulty genes, such as BRCA1/2 in breast 
and ovarian cancer; other cancers are fuelled 
by hormones such as oestrogen, testosterone 
or insulin. And then there are pathogens: the 
WHO estimates that 6% of cancers in wealthy 
nations and 22% in low- and middle-income 
countries are caused by viruses such as HBV, 
HPV and hepatitis C virus (HCV), bacteria 
such as Helicobacter pylori and waterborne 
parasites. Lifestyle affects cancer risk, too. 
In the past two decades as waistlines have 
expanded, so has the evidence linking obesity 
with the risk of breast, endometrial, colorectal 
and other cancers. 

Yet even when the causes are understood, 
it is not easy to translate that knowledge into 
preventive actions. It was the 1950s when 
British epidemiologist Richard Doll proved 
smoking causes lung cancer, but it took dec- 
ades to whittle away at the cigarette culture. 
Smoking restrictions imposed in the past 10 years 
were the result of a slow, incremental gather- 
ing of medical data and political will. Even- 
tually “it became convincing to the public, at 
which point it was much easier to regulate,” 
says David Hunter, an epidemiologist at the 
Harvard School of Public Health. 

Smoking, however, is something of
a special case. Building the political will
for smoking bans in workplaces and public
facilities hinged on the health risk of second-hand
smoke to non-smokers. Could similar
regulations be introduced when “there is not
a direct cause and effect between your behaviour
and my potential illness?” asks Hunter.
Public health advocates can argue, for example,
that obesity-related illnesses increase overall
healthcare costs, but the logical steps from
eating habits to obesity and then cancer risk
are less straightforward. “That’s a much more
indirect case and it’s harder to make,” Hunter
points out.

PREVENTION’S DECLINING SHARE
The share of the US National Cancer Institute budget devoted to prevention is in decline.

Future cancer prevention strategies might 
curtail individual choice — be it mandating 
vaccinations, banning trans-fats or taxing 
unhealthy food. These are treacherous polit- 
ical waters. “They are hard decisions that will 
not be popular,” says Arnie Purushotham, an
oncologist at the Integrated Cancer Centre 
(ICC), King’s College London. 

The unpopularity of such policies is evident 
from recent examples. Despite widespread 
efforts to promote access to the HPV vaccine, the 
ACS estimates that fewer than one in four girls 
who begins the course of vaccinations actually 
finishes it — partly owing to the social stigma 
associated with a cancer caused by a sexually 
transmitted infection. Efforts like the 2006 trans- 
fats ban in New York City are often decried as 
‘nanny state’ meddling. Attempts to pass a 1% tax 
on sugary drinks in New York were ridiculed — 
in January 2011 the state's health commissioner 
said the tax was off the table, for now at least.

The broader notion of cancer as a prevent- 
able disease is not yet fully accepted. A study 
published in May 2010 found that the more 
local news coverage of cancer watched by 
Americans, the more likely they were to have 
fatalistic views of the disease. Such ideas are
even more problematic in a world of growing 
cancer risks, as developed countries export 
their bad habits. “Tobacco companies moved 
away from rich countries to poor countries,” 
says Peter Boyle, head of the International 
Prevention Research Institute in Lyon, France. 
Higher rates of smoking, obesity and alcohol 
consumption mean low-income countries 
will struggle, but without the infrastructure 
to cope. “The poor countries of the world are 
going to be absolutely hammered in the next 
couple of decades by the diseases that are common
in the developed countries now,” says
Boyle. And cancer is leading the charge. 


ELUSIVE FUNDING 

Prevention research costs money, but fund- 
ing decisions tend to be skewed in favour of 
developing treatments. Prioritizing prevention 
requires long-term thinking, yet government 
research goals can shift with each election. 
Ten years ago, 11.4% of the National Cancer 
Institute’s annual budget was specifically allo- 
cated to cancer prevention and control. Since 
then, that allocation has steadily declined 
(see Prevention’s declining share, page S2). 
According to a report by Purushotham and 
colleague Richard Sullivan for the G20 sum- 
mit in 2010, less than 4% of worldwide public 
funding for cancer research goes to prevention. 

Some of the reasons for this distribution are 
obvious. The need for treatment is urgent, and 
survivors often go on to champion the cause. 
Prevention lacks this powerful advocacy 
group. “There is not a grateful patient to pressure
the politicians to increase funding for the
disease they have or have been cured from,”
says Hunter. In prevention, he adds, “the successes
are virtual. With a cure, or even just a
short increase in life expectancy, it feels more 
real for the public”. 

Moreover, prevention research “entails a
very different form of research than setting up
a lab and getting some mice and putting some
carcinogen on them,” says Ian Magrath, at the
International Network for Cancer Research and
Treatment, a not-for-profit organization based
in Brussels, Belgium. “The type of research
you have to do takes a much more fuzzy form
because it involves human behaviour and
psychology.”

Industry financing models struggle too. 
Developing a new treatment can cost as much
as US$1.3 billion. That

COUNTING THE COST OF CANCER
The burden of cancer, calculated as the cost of years lost from ill-health, disability or early death, outweighs all
other health concerns.
Heart diseases: $753.2 bn
Cancer: $895.2 bn



much risk are you willing to take? Most peo- 
ple would say zero,” says Kenneth Kaitin, a 
pharmacologist at Tufts Center for the Study 
of Drug Development in Boston, Massachu- 
setts. Proving prevention is also harder and 
more time-consuming than proving treatment. 
“You have to have a large enough sample size 
of people who would eventually have cancer to 
prove that this is not just by chance,” says
Kaitin. “In essence you're proving the
negative.”

“The poor countries of the world are going to be hammered in the next couple of decades by the diseases that are common in the developed countries now.”

Timing has other ramifications. Drug
patents are filed in the early stages of
clinical testing, and apply for 20 years
or so, depending on extensions. After
approval, which can take 10 years or
more, a company has limited time before the
drug goes generic. Long trials will eat into
that time, reducing ability to recoup invest-
ment. And that is assuming they can negotiate
reimbursement from insurers. Compared with
drugs for treatment, Kaitin says, “it’s even more
cumbersome and onerous for a company to try
to justify reimbursement to prevent a disease.”
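
The patent-timing argument can be made concrete with a rough calculation. The sketch below is illustrative only: the function name and the 15-year timeline for a prevention drug are assumptions, and only the patent term of about 20 years and the approval time of about 10 years come from the article.

```python
def years_of_exclusivity(patent_term_years: float, years_to_approval: float) -> float:
    """Years left on a patent once a drug reaches the market (never negative)."""
    return max(0.0, patent_term_years - years_to_approval)


# Figure quoted in the article: a patent term of about 20 years.
PATENT_TERM_YEARS = 20

# Development timelines: 10 years is the article's figure for approval;
# 15 years is a hypothetical value standing in for longer prevention trials.
timelines = [
    ("treatment drug", 10),
    ("prevention drug with longer trials (assumed)", 15),
]

for label, years_to_approval in timelines:
    remaining = years_of_exclusivity(PATENT_TERM_YEARS, years_to_approval)
    print(f"{label}: about {remaining:.0f} years on the market before generics")
```

Under these assumptions, every extra year of trials comes straight out of the period in which a company can recoup its investment.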


LOOKING FORWARD 

Despite these hurdles, cancer prevention 
advocates are pushing ahead. “You have to 
try to do something that is achievable,” says 
Purushotham. Many research organizations 
are starting to infuse a prevention ethos into 
their medical approach. At the ICC, preven- 
tion messages are being added to the patient 
consultation process. If a patient comes in 
with a lump that turns out to be benign, for 
example, Purushotham’s team asks the
patient about lifestyle factors that could 
elevate their cancer risk. The ICC also has a 
pilot programme underway for ‘speed dating’ 
between family doctors and cancer special- 
ists, enabling oncologists to inform physicians 
about the latest cancer detection and preven- 
tion research. 

In certain institutions, prevention is 
slowly moving up the list of cancer priori- 
ties. In 2009, of all the programmes at the 
Yale Cancer Center (YCC), the prevention 
programme received the largest slice of NCI 
funding. And to promote better collaboration 
between disciplines, researchers in the YCC 
have monthly meetings with colleagues they 
wouldn't otherwise meet — psychologists 
working with molecular geneticists, epide- 
miologists collaborating with clinicians — to 
share data and talk about new strategies. “We 
try to increase cross-talk,” says Yong Zhu, 
co-director of Yale’s cancer prevention pro- 
gramme. “We need to increase our communi- 
cation among different research groups.” 

So the message is slowly being heard. “A 
prevention agenda is critical to have any kind 
of impact on the disease for the future,” says 
Purushotham. As the global health community 
lays the foundation for future policies, perhaps 
gathering the necessary will for widespread 
cancer prevention is a matter of reminding 
ourselves of some age-old wisdom. As Thomas 
More, the sixteenth-century philosopher, once
wrote: “It is a wise man’s part, rather to avoid
sickness than to wish for medicines.”


Tiffany O’Callaghan is a freelance writer 
based in London. 


1. Boyle, P. & Levin, B. (eds) World Cancer Report 2008
(IARC Press, 2008).

2. Niederdeppe, J. et al. Journal of Communication 60,
230-253 (2010).
First line of defence


Combinations of drugs are showing some promise as 
therapeutic agents that stop cancer before it starts. 


BY LAUREN GRAVITZ 


The mice in Xiangwei Wu's lab at the MD
Anderson Cancer Center in Houston
seem to be cheating death. Despite
being genetically pre-programmed for colon
cancer, they're staving off disease with a novel
course of ‘cancer cleansing’. Every three weeks,
Wu injects the mice with a drug combo that 
targets only the mutant cells, prompting the 
cells to self-destruct. There are no side effects, 
nor is there any visible damage. These mice are 
receiving the most cutting-edge therapy con- 
ceivable: short-term treatment for long-term 
prevention. 

Taking drugs to prevent cancer — rather 
than treat it — is not a new idea. In fact, over 
the past few decades a new specialty known as 
‘chemoprevention’ has grown up around the
concept. It’s a tantalizing notion: swallow a pill 
that can thwart disease before it manifests. The 
reality, of course, may never be that simple. 


An untold number of genetic changes can 
trigger cells to become cancerous — predict- 
ing what those changes will be and who will 
get them is still incredibly difficult (see Por- 
tents of malignancy, page S19). Studying the 
changes as they occur is even harder, at least 
in humans. Researchers cannot investigate a 
developing disease until they know it’s there, 
which means that stopping it is a biomolecu- 
lar game of Whac-a-Mole. Complicating the 
problem is the thorny issue of people who feel 
fine starting a long-term drug regimen — one 
that can cause troublesome side effects or even
put them at risk for another disease.

The process of carcinogenesis can be exceed-
ingly slow. “Cancer begins 20 years before a
woman feels the lump in
cancer chemoprevention 35 years ago. That
long span of time offers several opportuni- 
ties for intervention. There’s the mutation of a 
cell into something not-quite-healthy but also 
not-quite-cancerous. There’s the transforma- 
tion of a population of these precancerous cells 
into something that is immortal and grows 
unchecked. And then in most cancers, there's 
the metastatic movement of some of these cells 
away from the tumour, which establish a foot- 
hold elsewhere in the body. 

A chemoprevention agent that blocks 
the very first step would be best. But most 
researchers would consider a drug successful 
if it could stop the disease progressing from 
any stage to the next. 


QUENCHING INFLAMMATION 

Breaking the first link in the carcinogenesis 
chain is the ideal place to start: kill off any 
mutant cells that the immune system misses 
and do so before they have a chance to become 
cancerous. Yet this is a difficult proposition. 
Not only is it nearly impossible to observe 
these early changes in humans, but a drug for 
such early prevention would have to be abso- 
lutely benign in order to justify a near lifelong 
prescription. 

Many in chemoprevention are beginning 
to think that perhaps the best way to catch 
cancer is to target inflammation. Chronic 
inflammation appears to encourage tumours 
by prompting the growth of new blood vessels 
and a remodelling of the extracellular matrix 
— creating a prime setting for normal cell 
growth to turn malignant. 

Inflammation may be at the root of a host 
of serious ailments, from heart disease to 
diabetes. Stemming the tide of inflammation 
might prevent not only cancer but a number 
of other diseases. This theory led chemo- 
prevention researchers to turn to two drugs 
— celecoxib and aspirin — that target the 
cyclooxygenase enzymes (COX-1 and COX-2), 
which play key roles in inflammation and 
pain. Both of the drugs are non-steroidal anti- 
inflammatories (NSAIDs), which epidemio- 
logical studies suggest may reduce the risk of 
colon and other cancers. 

Aspirin inhibits both COX-1 and COX-2, whereas
celecoxib (Celebrex) selectively inhibits COX-2.
COX-1 is produced in tissues throughout the body, and is
known to mediate the production of prosta- 
glandins — chemical messengers that control 
a number of physiological functions, such 
as lowering blood pressure, regulating body 
temperature and controlling inflammation. 
COX-2, on the other hand, is strictly regulated 
and tends to spike during inflammation and 
other stress — an abundance of COX-2 has 
been linked to the growth and proliferation of 
cancerous and pre-cancerous cells. Inhibiting 
the COX pathways can alter cancerous and 
precancerous cells by decreasing blood vessel 
formation and cell growth. COX inhibition 
enhances a mutant cell’s ability to commit 
suicide in a process known as apoptosis and
enables the immune system to recognize and 
target the cells for destruction. 

Celecoxib and other COX-2 inhibitors such as
rofecoxib (Vioxx, since removed from the
market) were developed as alternatives to
aspirin, which can cause gastrointestinal
bleeding after prolonged use. And, for a
while, they seemed promising not just for pain
but also for cancer prevention. The US Food 
and Drug Administration (FDA) approved 
daily doses of celecoxib for reducing colon 
cancer risk in people with a rare genetic disease 
called familial adenomatous polyposis (FAP), 
which causes dense outcroppings of intesti- 
nal polyps. Early reports indicated that daily 
doses of COX-2 inhibitors could decrease the 
risk of breast, skin and colon cancer in high-
risk individuals. While COX-2 inhibitors are
still a promising drug class for chemoprevention,
no drugs that work through this pathway are now
marketed for cancer prevention.

But it became apparent that prolonged
use of COX-2 inhibitors increased the risk of 
stroke and heart attack, so the FDA banned 
most from the market. Celecoxib remained 
available until early February 2011, when the 
manufacturer voluntarily removed labeling 
for this use due to its inability to do follow- 
up studies. A number of cancer prevention 
researchers continue to pursue COX-2 inhibi- 
tors, trying to develop compounds with the 
same inflammation- and cancer-fighting 
effects as celecoxib but without the risk of 
cardiovascular problems. 

Aspirin comes with its own set of risks. 
“People die from it,” says Leslie Ford, associate
director for clinical research in the cancer pre-
vention division of the National Cancer Insti-
tute (NCI) in Washington. “There are about
16,000 deaths a year from gastro-intestinal
bleeds in people taking aspirin.” But these risks
aren't preventing clinical trials, and the evi-
dence is mounting that aspirin can decrease
a person’s risk of colon cancer, as well as lung,
prostate and even brain cancer.

Other systemic, inflammation-targeted 
drugs may also help ward off a variety of can- 
cers. Some studies suggest that statins — devel- 
oped for cholesterol management — might 
disrupt the growth and proliferation of cancer 
cells. Reports suggest that medications taken 
to increase insulin sensitivity in type 2 diabetes 
could lower risk for several types of cancers,
including lung, head and neck. Such drugs
include metformin, pioglitazone and rosigli- 
tazone (although the latter has recently been 
linked to heart problems). 

Precisely how or why these drugs might 
aid cancer prevention remains a mystery. 
“Every time I think I have a specific agent for 
a specific pathway, when it’s tested, I find all 
sorts of other activities that may play into it as 
well,” says Ernest Hawk, head of the Division
of Cancer Prevention & Population Sciences 
at MD Anderson Cancer Center. “This really 
gets at the question: do we really ever know the 
mechanism of anything?” 


CURBING PREMALIGNANT CELLS 

Currently, most chemoprevention aims to pre- 
vent premalignant cells from completing the 
process of carcinogenesis. These agents have 
clear cellular targets and are intended to treat 
people at high risk — those with a family his- 
tory of disease or a known genetic mutation, 
or who are already known to have precan- 
cerous cells. 

Because different cancers evolve in various 
ways, a one-size-fits-all approach is not on the 
horizon. “One of the biggest challenges of this 
field is that it’s really not one field,” says Eva
Szabo, a researcher in the cancer prevention 
division of the NCI. “Before we understood the 
complexity that is cancer, the thought was that
we could have a generalized strategy that could 
prevent the transformation of cells, or could 
arrest progression.” Now, she says, it’s apparent 
that lung or breast or colon cancers can take 
multiple forms. And because the pathogenesis 
is different, the prevention has to be different, 
too. 

Most of the medications approved by the 
FDA for the purpose of treating precancer- 
ous cells or reducing cancer risk (see Table 
opposite) fall into this category. Women who 
fall into high-risk categories for breast can- 
cer, for example, can take raloxifene (Evista) 
or tamoxifen (Nolvadex), both of which have 
been shown to cut a woman’s risk of estrogen-
receptor-positive breast cancers by as much as
50%. And five different creams and ointments 
have been approved to prevent skin lesions 
from developing into skin cancers such as 
squamous cell carcinoma. 

Xiangwei Wu's colon cancer-prone mice 
belong in this category, too. Wu's two-drug 
combination takes advantage of mutations 
specific to precancerous cells in the colon, 
prompting only those mutant cells to self- 
destruct. “Because most preventive drugs don't 
get rid of the bad cells, people have to be on 
them continuously for a long time,” says Wu. 
Such is the case with raloxifene and tamoxifen. 

Wu's mice seem to be experiencing the best 
of all possible outcomes: they receive intermit- 
tent therapy that kills emerging cancer cells, 
but benefit from breaks between doses to 
recover from any side effects. In addition, the
two agents, tumour necrosis factor-related
apoptosis-inducing ligand (TRAIL) and all-
trans-retinyl acetate (RAc), work synergisti-
cally. Most researchers see this as important in
preventing a rebound of drug-resistant popu- 
lations. TRAIL has already proven promising 
for treating cancer because it appears to leave 
healthy cells unharmed while inducing cancer 
cells to self-destruct. In combination with RAc, 
Wu has found, TRAIL can induce precancer- 
ous cells to commit suicide, too (see Cancer- 
stopping combo). 

Another two-drug combo aimed at colon 
cancer is one of the most promising chemo- 
prevention prospects in the pipeline. Low 
daily doses of sulindac, an NSAID, along with 
difluoromethylornithine (DFMO) appear to 
do far more good together than either does 
alone — and with minimal toxicity, according 
to researchers at the University of California, 
Irvine. “We wanted to find the lowest dose of 
each at which we could find a relevant effect 
in the colon,” says Frank Meyskens, director of
the university's Chao Family Comprehensive 
Cancer Center. 

The combination is one of push and pull. 
Cancerous cells have a hard time regulating the 
metabolism of polyamines and so have abnor- 
mally high levels of these compounds, which 
healthy cells use for growth and development. 
DFMO prevents polyamine synthesis, while
sulindac triggers cells to purge them. Together, in
low doses, the two compounds work to lower
polyamines in precancerous tissue; the result
is fewer precancerous growths, which are
risk factors for colon cancer. In clinical trials,
Meyskens’ team showed that after three years
of therapy, the combined treatment reduced
the occurrence of polyps by 70% and of
advanced adenomas, the ones most likely to go
on to develop into cancer, by more than 90%.
The researchers found a company, Cancer Pre-
vention Pharmaceuticals, to take the treatment
through phase III trials into the market.

| Drug | Brand name(s) | Cancer type | Year first approved | Target / mechanism | Dosing (how long, how often) | Original manufacturer |
| --- | --- | --- | --- | --- | --- | --- |
| Tamoxifen | Nolvadex, Istubal, Valodex | Breast | 1998 | Selective estrogen receptor modulator (SERM) | Daily, for 5 years | AstraZeneca |
| Raloxifene | Evista | Breast | 2007 | SERM | Daily, duration unknown | Eli Lilly |
| HPV vaccine | Gardasil, Cervarix | Cervix, vulva, vagina/anus | 2006 | Elicit immune response to prevent infection by the most common cancer-causing types of HPV | 3 doses over the course of 6 months | Merck & Co (Gardasil); GlaxoSmithKline (Cervarix) |
| Porfimer sodium + photodynamic therapy (PDT) & omeprazole | Photofrin | Esophageal | 2003 | Lodges in precancerous cells and upon exposure to certain light produces an active form of oxygen that kills nearby cancer cells | Single injection, followed by light therapy three days later. Can be repeated after 90 days | Axcan |
| Fluorouracil | Efudex, Fluoroplex, Carac | Skin | 1970 | Interferes with DNA synthesis and leads to cell death | Apply to affected areas twice daily until lesions are gone, as long as 10-12 weeks | Valeant |
| Diclofenac sodium 3% | Solaraze | Skin | 2000 | Exact mechanism is unknown | Apply to lesion twice daily, for 60-90 days | PharmaDerm |
| 5-aminolevulinic acid + PDT* | Levulan | Skin | 1999 | Solution kills precancerous cell when exposed to light* | Apply a topical solution to lesion, then single photodynamic therapy treatment 14 to 18 hours later | DUSA |
| Imiquimod | Aldara (5%), Zyclara (3.75%) | Skin | 2004 | Enhances immune response and promotes apoptosis | Zyclara: applied topically for 2 weeks, then a 2-week break. Aldara: applied twice a week for 16 weeks | Graceway Pharmaceuticals |


PREVENTING METASTASIS 

With most solid-tumour cancers, the biggest 
danger is not the tumour itself but its ability 
to metastasize. While the primary tumour 
continues to grow, rogue cells break off, work 
their way into the blood stream, and move on 
to colonize other areas of the body. Metastases 
represent an advanced accumulation of genetic
damage, with too many mutations for
conventional drugs to target at once.

“Across all cancers, over time, there is an 
almost exponential increase in the number of 
genetic mutations,” says Raymond Bergan, a
specialist in preventive oncology at Northwest- 
ern University medical school in Chicago. “We 
do a pretty good job of making a drug that can 
hit a target. But making one drug that will hit 
two different targets is very difficult.” 

Bergan is aiming for a target that prevents 
growth and metastasis of a primary tumour. 
His lab has been investigating genistein, a soy 
isoflavone that has been sold as a nutritional 
supplement for years. Bergan’s group is putting 
the compound through its paces in the lab, and
it is showing promise in trials for preventing,
and even reversing, the metastatic process of
prostate cancer.

For cancer to metastasize, tumour cells must 
detach from their neighbours. This decreased 
adhesion is at least partially mediated by the 
enzyme focal adhesion kinase. Studies show 
that genistein blocks the activation of this 
enzyme so that prostate cancer cells remain 
tethered to the tumour. 

Genistein also prevents mobile prostate can-
cer cells from invading tissue. As healthy cells
grow and divide, the enzyme matrix metal- 
loproteinase-2 (MMP-2) helps break down 
extracellular membrane proteins to make 
way for new growth. One of the proteins it 
degrades, however, is first in line for attack by 
invading cancer cells. Researchers have found 
that higher concentrations of MMP-2 corre- 
late with poor prognosis. Genistein counter- 
acts this by targeting a protein that increases 
MMP-2 concentrations. By binding to and 
inhibiting this protein, it can prevent produc- 
tion of MMP-2. 

In phase II clinical trials, Bergan has shown 
that genistein can decrease MMP-2 in human 
prostate tissues — he’s now looking at whether 
this prevents cancer cells from moving beyond 
the prostate, thereby reversing the cancer’s 
evolution into metastatic disease. Following 
up on Bergan’s research, Seema Khan, also at 
Northwestern University, is studying whether 
genistein — as part of a mixture of soy isofla- 
vones — might prevent the spread of breast 
cancer. Her results, she says, are far less con- 
clusive and hint that in younger women, this 
treatment could lead to a slight increase in cell 
proliferation. 


Indeed, uncertainty and ambiguity are 
the norm in cancer prevention research. “I 
believe that there are enough instances where 
chemoprevention has been deleterious that it 
calls into question how much we understand,”
says John Potter, a senior advisor at the Fred
Hutchinson Cancer Research Center in Seattle.
“If I were to write a prescription for the field, I'd
want to match risk and benefit.”

Tailoring the right therapy to the right risk 
profile will help. “Most people think of cancer 
prevention like preventing polio: you get a one- 
time shot and you never have to worry about 
it again,” says Powel Brown, chair of Clinical 
Cancer Prevention at MD Anderson. “But it’s 
more likely to be akin to taking anti-cholesterol 
medicine for the prevention of heart disease.” 
That is, you'll be taking the medication indefi- 
nitely — and while there may be side effects, 
they will be acceptable to reduce the risk of a 
potentially fatal disease. 

Indeed, if scientists and clinicians are able to 
establish a cancer prevention system as well- 
tuned as that for heart disease, it would count 
as a massive success. For now, researchers in 
the field will have to look to the mice in Wu's 
lab, which are revealing one possible path to
a cancer-controlled, if not cancer-free,
future.


Lauren Gravitz is a writer in Los Angeles. 
Know your enemy


Vaccines are arguably our greatest medical achievement. 
But to what extent can they help prevent cancer? 


BY MICHAEL EISENSTEIN 


Cancer operates like a well-disguised
saboteur. Occasionally it slips up by
displaying unusual proteins, tripping
immunological surveillance systems that are
checking for abnormal growth. For decades 
now, scientists have tried to exploit this vul- 
nerability with therapeutic vaccines — injec- 
tions of tumour-associated proteins that 
essentially hang a ‘Wanted’ poster, helping 
immune cells recognize and kill cancer cells. 

After a string of expensive and dispir- 
iting defeats, therapeutic cancer vaccines 
recently registered their first big win. In April 
2010, the US Food and Drug Administration 
(FDA) approved Provenge (sipuleucel-T),
a mixture of a patient’s own cells incubated
with a protein expressed by 95% of prostate 
tumours. This was not an unequivocal victory, 
however. “On average, patients live about four 
months longer,” says Martin Kast, a cancer vac-
cine specialist at the Norris Comprehensive
Cancer Center at the University of Southern 
California (USC) in Los Angeles. “It certainly 
measures up to many chemotherapeutics, but 
there's still a long way to go.” 

S8 | NATURE | VOL 471 | 24 MARCH 2011

The problem is that as cancer develops, it
disrupts the immune system to prevent an
effective counter-attack. Accordingly, Kast and
other researchers have been refocusing their
efforts towards developing cancer vaccines as
preventative measures, before cancer sets its
traps. “You don’t wait to get polio before 
taking the polio vaccine — you vaccinate 
prior to the engagement of the pathogen,” 
says Vincent Tuohy, an immunologist at the 
Cleveland Clinic in Ohio. With therapeutic 
vaccines “we were asking the immune system 
to go in and heroically eliminate this large and 
mature tumour load, and that’s a problem. 
If you want to get rid of a disease, you do it 
prophylactically”. 


CATCHING CANCER 
So far, the best examples of anticancer vac- 
cines are those that thwart cancer-causing 
infections. For example, vaccination against 
the hepatitis B virus (HBV) offers lasting pro- 
tection against liver cancer: an estimated 54% 
of hepatocellular carcinoma (HCC) cases are 
attributable to HBV. A 20-year study in Taiwan 
demonstrated that vaccination reduces the risk 
of developing HCC by roughly 70% (ref. 1). 
Vaccines against human papillomavirus 
(HPV, see main image) can potentially have 
even greater impact. HPV is responsible for 
almost all of the half-a-million cases of cer-
vical cancer worldwide, as well as 60,000 cases
of anal, genital and throat cancer. The two
FDA-approved vaccines, Gardasil (Merck,
approved 2006) and Cervarix (GlaxoSmith-
Kline, approved 2009), both demonstrate
remarkable efficacy in preventing infection 
by HPV strains 16 and 18, which account 
for the lion’s share of HPV-associated can- 
cer. “I’m optimistic that we're going to have 
long-term protection,” says John Schiller,
head of the Neoplastic Disease Section at the 
National Cancer Institute in Bethesda, Mar- 
yland, and a co-inventor of both vaccines. 

At least a dozen other strains of carcinogenic 
HPV elude these vaccines, however. The reach 
of a vaccine is determined by its valency, which
is the number of different targets it presents 
to the immune system. Gardasil is effective 
against four strains of HPV, Cervarix two. 
Merck is looking to up the ante: its V503 vac- 
cine, now in a phase III clinical trial, covers 
nine strains. Meanwhile, Schiller is developing 
an ‘omnivalent’ vaccine, derived from a differ-
ent viral protein, which could offer complete
protection. In tissue cultures, Schiller says, 
this vaccine “prevents infection by all the 
HPV types we've ever tried”. Schiller antici- 
pates that such a broad antiviral vaccine could 
be routinely given to younger patients of both 
sexes, greatly expanding coverage — as of 
2008, only 18% of girls aged 13 to 17 years in 
the United States had received a full course of 
HPV vaccinations. Sanofi-Pasteur is likely to 
advance such an omnivalent approach into 
clinical trials in the near future. 

Scientists are also starting to make pro- 
gress in developing vaccines against Heli- 
cobacter pylori, a bacterium linked to 60% 
of the one million or so cases of stom- 
ach cancer worldwide. In 2008, a team 
led by Peter Malfertheiner, a gastroenterol-
ogist at the Otto-von-Guericke-Universität
in Magdeburg, Germany, showed that a can- 
didate vaccine — now under development at 
Novartis — was both safe and capable of rais- 
ing a strong immune response against selected 
proteins’. “These antigens are key players in 
the pathogenesis of H. pylori infection,” says
Malfertheiner, “and we found a very significant 
systemic response.” His team plans to expose 
healthy volunteers to a moderately virulent 
strain of the bacteria to test the vaccine’s 
protective capacity. 


FRIEND OR FOE 

Most cancers — including major killers such 
as lung and colorectal cancer — are not caused 
by infections. In these cases, prophylactic 
vaccines must target essential proteins that 
the tumour needs to thrive. Fortunately, the 
past decade has seen great progress in both 
cataloguing tumour proteins and develop- 
ing genetically modified mouse models that 
closely mimic human cancer progression. 
Pier-Luigi Lollini, a molecular oncologist 
at the University of Bologna in Italy, reports 
early promising results from vaccinating mice 
against HER2/neu, a protein over-expressed 
in many breast tumours’. “We can prevent 
tumour onset, and the ongoing carcinogenic
process fuelled by HER2, in mice for most of
their adult life,” says Lollini. His focus is purely
on animal studies, but colleagues Guido Forni 
and Federica Cavallo are pursuing a parallel 
HER2/neu vaccine strategy in humans that 
could soon enter clinical trials. 

Most cancer proteins are, to a certain extent, 
produced by healthy cells and as such enjoy 
a privileged ‘self’ status that protects them 
against immunologic attack. By incorporat- 
ing carefully selected adjuvant molecules that 
enhance the immune response, vaccines can 
break this inherent tolerance and turn immu- 
nity against these self-proteins — without the 
nasty effects seen in autoimmune diseases. 
For example, a 2010 study by Tuohy and col- 
leagues showed striking success in a mouse 
model of breast cancer by targeting the milk 
protein α-lactalbumin*. This protein is nor-
mally expressed only during late pregnancy
and lactation, but Tuohy notes that expression 
is also common in newly formed tumours. 
“One of the things they do is make inappro-
priate proteins like α-lactalbumin,” says Tuohy.
In fact, vaccinated mice achieved 100% protec- 
tion against breast cancer, provided they were 
dosed before tumours began to develop. 
Tissue damage and inflammation were limited 
to the breast tissue of nursing animals. This 
should not be a problem for humans, as the 
highest risk breast cancer patients are generally 
past childbearing age. “97% of women ‘retire’ 
their breasts from lactation after 40, and that’s 
exactly the age range where 95% of breast can-
cers occur,” says Tuohy.

Some researchers contend that broader 
protection could be possible with vaccines 
that target many different tumours. “You 
could potentially come up with multi-antigen 
approaches that cover 90% or even almost 
100% of populations of different tumour 
types,” says Mary Disis, head of the Tumor Vac-
cine Group at the University of Washington in
Seattle. Immunologist Olivera Finn’s group 
at the University of Pittsburgh has worked 
extensively with one such candidate, mucin-1
(MUC1). “About 80% or more of all human
cancers aberrantly express this protein,” says
Finn. “The immune system sees that abnor- 
mal expression and generates a response.” This 
response on its own is too weak to prevent 
cancer onset, but Finn and colleagues have 
developed a MUC1 vaccine that prepares the
immune system in advance of tumour forma- 
tion. Early trials have demonstrated a strong 
and potentially protective immune response 
in more than 50% of patients. Finn’s group is 
conducting a clinical trial of MUC1 in Pitts- 
burgh with patients at high risk for colorectal 
cancer — one of the few prophylactic cancer 
vaccine trials currently underway. 


TRIALS AND TRIBULATIONS 

Prophylactic cancer vaccine trials face 
numerous obstacles, including the long- 
term endpoints needed to determine preven- 
tion in cancer-free individuals. Schiller’s work 
with cervical cancer vaccines has benefited 
from the fact that lesions arising from HPV 
infection are a powerful predictor of cancer 
risk. Gardasil and Cervarix “haven't formally 
been demonstrated to prevent the cancer”, he
says, but preventing 98% of HPV-associated
lesions represents a “good surrogate marker
for protection”.

Finn and other researchers are using simi- 
lar pathological features to select individuals 
likely to acquire malignancies in the near term. 
Finn’s MUC1 colon cancer prevention trials,
for example, focus on patients with advanced 
adenomatous polyps — unusual lumps on the 
colon wall. Such growths, explains Finn, repre- 
sent “the latest stage of premalignant changes, 
most proximal to colon cancer”. Likewise, 
Kast’s team at USC vaccinates mice shortly 
after they develop precancerous growths in 
their prostate’. This approach straddles the line 
between therapy and prevention: the vaccines 
flag existing abnormalities but nevertheless 
train the immune system to block the advance 
of full-blown cancer. Timing appears to be the 
crucial element that separates a failed thera- 
peutic vaccine from a successful prophylactic 
one. “If you vaccinate early on, before there
is actual cancer but while carcinogenesis is 
underway, you have a chance,” says Kast. 
Results from his studies in mice are promising. 
“Instead of dying between 6 and 12 months, 
most mice were still alive at 1.5 years, and they 
were dying of old age, not prostate cancer.” 

Even with a head start from preselecting high- 
risk patients, human studies will be a lengthy 
and expensive slog — a major deterrent for 
many companies and funding agencies. This 
is especially problematic for infections such 
as H. pylori, which primarily affects the devel-
oping world. “The calculation of the cost-
benefit ratio is critical,” says Malfertheiner.
“The developing world does not have the money 
to buy vaccines, whereas the developed world 
has good therapeutics that can treat infection.” 
Even existing vaccines such as Gardasil remain 
a costly proposition for poor nations, although 
improved formulation and low-cost generics 
may address this in the near future. 

Exposure for cancer vaccines also remains 
a problem. The first dedicated conference, 
scheduled for March 2011 at Arizona State 
University in Tucson, was abruptly cancelled 
in February. “We were described as being too
far ahead of the curve,” says Kast. However, a
symposium on preventative cancer vaccines in 
April 2011 at the annual meeting of the Ameri- 
can Association for Cancer Research in Florida 
should help stimulate interest. For research- 
ers, promising results from early stage multi- 
antigen studies speak for themselves. “When I 
first got into the field of tumour immunology 
15 years ago, I remember saying, ‘I don't think
I'll ever see a vaccine to prevent cancer’,” recalls
Disis. “I may have to eat my words.”


Michael Eisenstein is a freelance journalist in Philadelphia.
Current vaccines against human papillomavirus and hepatitis B virus have already made an impact on the incidence of certain cancer types. Future vaccines can potentially prevent millions of different cancer cases.
PERSPECTIVE


The big C — for 
Chemoprevention 


Drugs to prevent cancer are clearly possible despite some 
early missteps, says Michael B. Sporn. Restoring the 
cooperative ethos of decades past will help get us there. 
When a complex system starts to
dysfunction, it is generally best to
fix it early. The alternative often
means delaying until the system has degenerated
into a disorganized, chaotic mess
— at which point it may be beyond repair. 
Unfortunately, the general approach to can- 
cer has ignored such common sense. The 
vast majority of cancer research is devoted 
to finding cures, rather than finding new 
ways to prevent disease. 

The results of these skewed priorities are 
plain to see: forty years after President Rich- 
ard Nixon declared war on cancer, the death 
tolls from most common forms of cancer 
in the United States have not fallen. It’s true 
that for some cancer types, mortality rates 
(adjusted for population size) have dropped 
during these decades, but there are huge, 
unhappy exceptions: mortality rates for lung 
and pancreatic cancer have stayed level since 
1970, and the total number of US deaths 
each year from those diseases has doubled.

Looking at these discouraging statistics, 
it is clear that something needs to change. 
We have been looking at the very nature of 
cancer in the wrong way. Breast cancer 
doesn't begin when a lump is first felt or 
detected by mammogram. All the com- 
mon epithelial cancers (lung, colorectal, 
breast, prostate, pancreas and ovary), which 
account for the majority of deaths, have 
a long latency period — often 20 years or 
more. By the time they are clinically detect- 
able, the cells in such carcinomas may har- 
bour hundreds of mutations in different 
genes. These cells provide no simple,
single target for therapy. In contrast, 
during the long latency period, there is 
ample opportunity to use multi-functional, 
multi-targeted preventive drugs that block 
the development of invasive and metastatic 
disease. 

That's the basic idea of cancer chemoprevention
(see First line of defence, page S5):
to arrest or reverse the progression of pre- 
malignant cells towards full malignancy, 
using physiological mechanisms that do 
not kill healthy cells. In experimental ani- 
mals, it is now possible to prevent the onset 
of cancer in almost all the common organs 
in which human carcinoma occurs. Even 
more importantly, chemoprevention has 
now been validated in people. One class of 
drugs, known as selective estrogen receptor 
modulators (SERMs), can deliver as much 
as a five-fold reduction in incidence of 
estrogen receptor-positive breast cancer 
in women. These compounds — most 
notably tamoxifen, raloxifene and lasofoxifene
— have the added benefit of suppressing
osteoporosis. Fenretinide, for
which we have 15 years’ worth of data,
provides significant prevention of breast
cancer in premenopausal women. Two
anti-androgenic agents, finasteride and
dutasteride, have been shown to be effective
at reducing incidence of prostate cancer in
long-term clinical trials.

And yet we have a paradox: within the 
world of clinical oncology, chemoprevention 
of cancer is perceived to be a failure. As a 
result of some poorly designed and executed 
clinical trials over the past decade, scepti- 
cism abounds about the practicalities of 
chemoprevention. This harsh assessment 
is the conventional wisdom among groups 
as diverse as the pharmaceutical industry, 
the hospital establishment, the insurance 
industry, women’s advocacy groups and the 
clinical oncology community itself. Of par- 
ticular disappointment to many advocates of 
chemoprevention has been the general lack 
of enthusiasm from large pharmaceutical 
firms, as exemplified by the recent decisions 
of two major companies to curtail further 
development of lasofoxifene and arzoxifene, 
another highly promising SERM. Many
factors have contributed to this negativity, 
including difficult regulatory approvals, 
duration of patent protection and the omni- 
present fear of liability in treating suppos- 
edly healthy people with drugs. 

But attitudes toward chemoprevention 
need to be re-examined. Most fundamental 
is the bizarre misperception that people are 
‘healthy’ until they have actual symptoms 
of invasive cancer, the corollary being that 
it is unwise and perhaps unethical to give 
a preventive drug to a healthy person. In 
reality, however, a person harbouring a pre- 
malignant lesion is not healthy, in spite of the 
absence of symptoms. Many of these people 
will go on to develop life-threatening can- 
cers. The barn in which hay is smoldering 
before it bursts into flames is not a safe place.

Another canard is that cancer preven- 
tion efforts are not cost-effective. The argu- 
ment is that the number of lives saved with 
a preventive drug would be too small with 
respect to the total number of people who 
need treatment. But this is a curious perspec- 
tive. The number of houses destroyed by fire 
is trivial compared with the total number of 
houses, and yet almost every homeowner 
insures against fire. The conceptual problem 
here is that everyone doesn't die of cancer 
in a short period; this is a lifetime problem.

There is a simple answer: we should stop 
doing clinical chemoprevention trials in 
large populations of people at relatively low 
risk, and instead focus on cohorts at the 
highest risk. There are many such groups: 
women with BRCA mutations that can 
lead to breast and ovarian cancer, people 
with premalignant pancreatic lesions and 

those with severe pre- 
groups will provide much more definitive
results and with much less effort. 

More broadly, the elevation of cancer 
prevention requires several actions. First, 
we need a massive educational effort to 
encourage the general public — not just 
special interest groups — to support pre- 
vention efforts. This has been very success- 
ful in cardiology; indeed, the pharmaceutical 
industry spends huge amounts of money 
on educational and advertising efforts to 
promote chemoprevention of cardiovas- 
cular disease with statins and anti-platelet 
agents. Unfortunately, these companies 
seem unwilling to similarly promote cancer 
chemoprevention. 

In addition, we need to be vigilant about 
the safety of the preventive drug testing regi- 
men. Thus, we should build in ‘rest periods’ 
in clinical chemoprevention trial design. 
Many drugs used for chemotherapy have 
severe toxicities if used long term, so rest 
periods are necessary and routinely used. 
Although, as a class, chemopreventive drugs 
are much less toxic, rest periods would make 
relatively safe drugs even safer. Moreover, 
drugs need to show extensive efficacy in 
animal experiments before undergoing 
human trial — wisdom that was forgotten 
in the failed clinical trials of beta-carotene, 
selenium and tocopherol. 

Regulatory accommodations will also 
help. The Food and Drug Administration 
still forbids the use of two experimental 
drugs in clinical prevention trials, in spite of 
the fact that there is overwhelming evidence 
that combinations can be much safer and 
more effective than single agents. Further
appreciation and understanding of the concept
of ‘risk’ is also essential. For new drugs,
the proper comparison is not risk versus
benefit but rather risk versus risk — that is,
the risk of doing nothing (which may have 
a deadly outcome), versus the risk of taking 
a preventive drug for long periods. Oncolo- 
gists could follow the lead of cardiologists, 
who have developed a handy scorecard that 
numerically quantifies a patient's risk.

On the basic science front, we must 
develop new multifunctional drugs that 
aim at entire networks in the body, rather 
than single targets. We need further studies
on the importance of epigenetics and
the tumour microenvironment to develop
chemopreventive drugs. The tumour 
microenvironment, with all of its stromal 
and inflammatory cells, is an essential part 
of a carcinoma. Major advances over the 
past decade in these areas are leading to the 
development of important new drugs for 
cancer prevention.

So the challenges are numerous and 
daunting. There is great interest in person- 
alized medicine (a wonderful goal) — but 
how can we do this successfully if the param- 
eters we assess are exclusively genetic? Our 
environment, which continually changes,
is reflected in epigenetic changes, inflammatory
cells and the paracrine mediators
they produce, as well as oxidative stress. 
All of these nongenetic parameters in turn 
can have profound effects on the structure 
and function of the genome. The ultimate 
justification for a preventive approach to 
control of cancer is that cancer prevention 
is an opportunity to provide a higher quality 
of symptom-free and pain-free life to peo- 
ple, rather than waiting until someone has 
invasive and metastatic cancer with all of 
its attendant suffering for both patient and 
family. 

We can look to the past for guidance. 
Fifty years ago, a unique spirit of intense cooperation
found cures for acute childhood
leukaemia and Hodgkin's disease, two previ- 
ously fatal conditions. This triumphant work 
was achieved through highly collaborative 
efforts that tested multiple combinations 
of drugs. In the case of leukaemia, it took 
many years of research at multiple institu- 
tions to find the proper mix and dosage of 
vincristine, amethopterin, 6-mercaptopu- 
rine and prednisone (VAMP) that eventu- 
ally enabled medical researchers Tom Frei 
and Emil Freireich and their team at the 
National Institutes of Health (NIH) to find 
a truly effective combination therapy. This 
effort involved not only the NIH, but teams 
at universities, as well as at several major 
pharmaceutical companies. A similar multi- 
group effort shortly thereafter led to the con- 
quest of Hodgkin's disease. 

Although we still have many cooperative 
groups for clinical trials, the all-hands-on-deck
spirit that promoted the cure of childhood
leukaemia and Hodgkin's disease has
largely disappeared in an increasingly com- 
petitive world. Regulatory and legal issues, as 
well as academic competitiveness and com- 
panies’ perceived need to protect intellectual 
property, impede cooperation. To make sub- 
stantial progress toward cancer prevention, 
we need to regain this lost ethos.


Michael B. Sporn is professor of 
School in Hanover, New Hampshire.
EPIGENETICS


Unravelling the cancer code 


There’s more to the genetic causes of cancer than sequence mutations. This added 
complexity could offer scientists an opportunity to tackle cancer even earlier. 


BY VICKI BROWER


What causes a cell to turn cancerous?
Ever since the discovery of oncogenes
in 1989, the prevailing theory
is that mutated genes drive the process: a
predisposition to cancer is somehow written 
in our genetic code. This depressing portrait 
might have made cancer prediction easier, 
but hindered attempts at cancer prevention. 
Fortunately, the story became a lot more com- 
plicated. 

It was geneticist Bert Vogelstein at Johns 
Hopkins University in Baltimore, Maryland, 
who clarified the role of that first oncogene 
as well as several others that followed; he is 
one of the originators of the notion of cancer 
as a genetic disease. Recently, while examin- 
ing biopsies from ovarian cancer patients, he 
discovered that more than half of the tumour 
samples had mutations in the ARID1A gene.
Yet ARID1A does not directly stimulate the
formation of tumours. “What we did not
expect,” says Vogelstein, “was that this gene
is involved in determining the epigenetic 
changes that can lead to tumours.” That is, 
the carcinogenic effect of ARID1A occurs by 
encouraging changes in gene expression levels 
and not in DNA sequence. 

Vogelstein’s findings were the tip of the 
iceberg. The Cancer Genome Project, an 
international effort to sequence the genomes 
of a number of different cancer types, has 
also found that “an incredibly high number 
of mutations actually affect epigenetics and 
epigenetic regulators”, says Jean-Pierre Issa,
an epigeneticist at the MD Anderson Cancer
Center in Houston, and a researcher on the
project. And Vogelstein’s insight into ARID1A
is a clue to the pathways that connect genetics 
and epigenetics in cancer. “This closes the 
loop, as we find that even genetic lesions 
are controlled by epigenetic mechanisms,” 
explains Issa. 

Such findings are leading scientists to realize 
that the root cause of cancer is more compli- 
cated than inherited or acquired genetic muta- 
tions. “We used to think of cancer in binary 
terms — genetic mutations, or no mutations
— but that’s not the case anymore,” says
David Sidransky, an oncologist at Johns Hop- 
kins. Researchers are finding that epigenetic 
changes frequently precede and can induce 
genetic mutations that cause cancer. If these 
early epigenetic alterations can be detected 
and reversed, it might be possible to prevent 
certain cancers. 


BEYOND THE HUMAN GENOME 
The best known epigenetic modification is 
methylation — whereby a methyl group (CH₃)
attaches to a portion of DNA. Methylation of a
gene reduces or stops its expression (see DNA 
methylation patterns, page S13). Patterns of 
methylation can be inherited from the mother 
or acquired during life. If the genetic code is 
the hardware for life, the epigenetic code is 
software that determines how the hardware 
behaves — and as such it can be rewritten. 
The Human Genome Project (HGP) aimed
to decode life’s hardware. Scientists hoped that
one result would be discoveries of disease- 
causing genetic mutations. But mutations like 
those in BRCA1/2, which are highly predictive
of cancer risk, were found to be the exception 
rather than the rule. Indeed, the HGP helped 
confirm that underlying most common dis- 
eases are hundreds if not thousands of genetic 
mutations that vary from person to person. 
“Even before the rough draft of the HGP was 
completed in 2000, it had become clear that 
it is impossible to understand the genetics of 
cancer without epigenetics,” says Issa.

The HGP drew attention away from the nas- 
cent study of cancer epigenetics, which began 
in the mid-1980s at Johns Hopkins in the lab 
of cancer biologist Stephen Baylin. He noticed 
that cancer cells contained regions of DNA 
with increased methylation and hypothesized 
that if a tumour suppressor gene was hyper- 
methylated, its activity would decrease or stop 
entirely — just as if it were a genetic mutation 
— allowing the tumour to flourish. In other 
words, Baylin reasoned, this epigenetic change 
would produce the same result as a genetic 
mutation. 

Firm evidence came in 1994. Baylin and 
his colleague, oncologist James Herman, were 
investigating renal cell carcinoma (RCC), 

the most common type 
tumour-suppressor gene (VHL), which hobbles
the gene’s ability to express the tumour
suppressing protein. Baylin and Herman 
showed that 20% of the remaining patients 
with the non-inherited form of RCC did not 
have a mutation in VHL. Their genes were 
silenced not by a mutation but rather by
hypermethylation.

The following year, in collaboration with 
Sidransky’s lab at Johns Hopkins, Baylin and 
Herman showed that human cancers commonly
arise when a particular tumour suppressor
gene, known as p16, is hypermethylated.
Moreover, in many cancers including RCC, 
epigenetic and genetic mutations often work 
in tandem: one of the two copies of a tumour 
suppressor gene is inactivated by genetic muta- 
tion, while the other is hypermethylated. This 
finding “convinced us that epigenetic abnor- 
malities could play an important driving role 
in cancer — and we and many others have 
been pursuing this possibility ever since,” 
says Baylin. 

The move from a purely genetic to an epige- 
netic model is crucial for prevention strategies. 
As numerous gene therapy trials have shown, 
it is very difficult to treat a genetic disease by
re-activating the dormant, mutated gene or 
by replacing it with a non-mutated one. “Epi- 
genetic changes are reversible, and therefore 
have an edge over genetics,” says Mukesh 
Verma, an epigeneticist at the National Cancer 
Institute's division of cancer control and population
sciences in Bethesda, Maryland. Furthermore,
epigenetic changes in cancer occur
before genetic mutations. “If you can prevent
methylation of those tumour suppressor genes,
you might have a valuable prevention strategy,”
says Baylin.

DNA METHYLATION PATTERNS
A common alteration in cancer cells is hypermethylation of tumour suppressor genes. In one case the tumour suppressor gene is transcribed as normal; in the other, the tumour suppressor gene is hypermethylated and is not transcribed, leading to loss of its normal cellular function.


THE ENVIRONMENTAL LINK 

Epigenetics has also provided clues that link 
environmental factors with cancerous genetic 
changes. Changes in methylation can be 
detected in the blood of cancer-free individu- 
als who smoke and eat high-fat diets, and these 
changes have been shown to precede genetic 
mutations. More recently, Karl Kelsey, a
molecular epidemiologist at Brown University 
in Providence, Rhode Island, has uncovered 
independent associations between epigenetic 
patterns in breast cancer tumours, the tumour 
size, alcohol consumption and folate intake.

A prime candidate at the interface of envi- 
ronment and genetics is chronic inflammation, 
which is known to precede the development of 
numerous types of precancerous lesions — and 
indeed certain cancers themselves, including 
oesophageal, liver and colon cancers. Inflam- 
mation has been linked with increased DNA 
methylation in otherwise healthy looking tis- 
sue. Issa calls chronic inflammation “a truly 
epigenetic phenomenon”. 

Long-term inflammation may result from 
infection with Helicobacter pylori or hepatitis 
C virus, or from autoimmune diseases such 
as ulcerative colitis (a form of inflammatory 
bowel disease). People with ulcerative colitis 
often develop colon cancer at a younger age 
— for example in their 50s — than the 60 to 70 
year average age of onset. “All of the epigenetic 
changes that occur in colon cancer, specifically 
DNA hypermethylation and gene silencing, are 
accelerated — and appear in the inflamed tissue
prior to actual cancer,” says Baylin.

Half the world’s population is infected with 
inflammation-causing H. pylori, yet stomach 
cancer afflicts barely 0.03% of those. “There 
must be something in the body itself that 
makes it react differently to infection,” says 
Emad El-Omar, a gastroenterologist at the 
University of Aberdeen in Scotland. El-Omar 
is investigating whether genetic variation can 
influence this response.

Genetic polymorphisms are normal genetic 
variations within a population that can subtly 
raise or lower each person's levels of a particu- 
lar protein. While everyone within the normal 
population produces the same pro- and
anti-inflammatory chemicals (or cytokines), an
individual’s particular levels vary according 
to genetic make-up. Certain polymorphisms, 
El-Omar hypothesized, might tip the balance 
towards chronic inflammation and then to 
cancer. 

El-Omar found that polymorphisms in the 
inflammation-related IL-1B and tumour
necrosis factor (TNF)-α genes determine
the levels of circulating IL-1β and TNF-α
pro-inflammatory cytokines. People with a 
genotype disposed to higher levels of these 
cytokines have an increased risk of developing
stomach cancer following H. pylori infection. 
El-Omar has made similar discoveries in other 
cancers. In colorectal cancer, for example, he 
discovered there is an inflammatory environ- 
ment in and around premalignant lesions. 
Within this region he found nine differentially 
expressed genes linked with inflammation, 
including those responsible for IL-8 and the 
CXCL family of cytokines.

Other researchers are making similar connections.
For example, there is evidence that
pro-inflammatory prostaglandin endoperoxide
synthase and TGF-β are both significantly
associated with increased risk of colon
cancer. Although the evidence linking inflam- 
mation to epigenetics and cancer is mounting, 
the underlying mechanism of the association 
and the value of screening for these polymor- 
phisms remain less clear. 


POTENTIAL FOR REVERSAL 

Drugs and dietary substances that alter epige- 
netic pathways are currently being tested. Dur- 
ing his research on RCC, for example, Baylin 
and colleagues were able to reverse hyper- 
methylation of the VHL gene with the drug 
5-azacytidine. Trials of demethylating drugs 
as adjuvant treatments to prevent lung cancer 
recurrence are underway. If successful, preven- 
tion trials are the next logical step. “We need 
five- and ten-year survival data with current 
drugs to be sure there are no secondary effects 
before we give them to reasonably healthy 
people for prevention,” says Issa. He sees a dif- 
ferent source for the first wave of preventive 
medications. “I would bank on discovering more 
‘gentle’ approaches to epigenetic manipulation 
for cancer prevention — be they natural products,
existing drugs with good safety records,
or even vitamins or diet.”

Epigenetic changes also have potential util- 
ity as biomarkers (see Portents of malignancy, 
page S19). Being able to read the methylation 
pattern of certain genes could give scientists a 
heads-up for people on the brink of develop- 
ing cancer. 

Slowly the importance of the epigenome 
in cancer development is being appreciated. 
“Geneticists are hugely more aware of the 
importance of epigenetics in the development 
of cancer,” observes Baylin. When it comes 
to cancer prevention, the future could lie in 
arresting the reversible epigenetic changes 
before irreversible mutations take hold.


Vicki Brower is a freelance writer in New York. 


EARLY DETECTION 


Spotting the 
first signs 


The sooner a cancer is found, the better. New technologies are
pushing the limits of detection — towards the frontier of prevention.


BY NEIL SAVAGE 


One day, a few years hence, a patient having
a routine check-up might do little
more than blow into a small machine 
at the doctor’s office and, within a couple of 
minutes, be told whether there are any early 
signs of cancer. For another patient, a routine 
blood test to monitor cholesterol might present 
an opportunity to check for stray cells from 
tumours too small to spot. A dermatologist, 
instead of eying a mole and perhaps slicing it 
off to biopsy, could instead peer at it through 
a machine to instantly tell whether it is malig- 
nant or benign. 
These, at least, are the visions of researchers
developing technologies to detect the early
signs of cancer. Better screening — looking for 
signs of cancer in people with no symptoms, 
as opposed to diagnosing suspected cancer 
— increases the odds that doctors will find 
cancer at its earliest stages when chances of a 
cure are higher. Screening has already reduced 
cancer deaths: the US National Cancer Insti- 
tute (NCI) estimates that colonoscopies can 
lower mortality from colorectal cancer by at 
least 60%, and the National Lung Screening 
Trial recently found that computed tomog- 
raphy scans of heavy smokers could cut lung 
cancer deaths by as much as 20%. Researchers 
are exploring a new suite of potential screening
methods that could one day join or even
supplant today’s colonoscopies, mammograms
and pap smears. If some of these approaches
can be shown to prevent cancer deaths and cut
costs, they stand a good chance of becoming
part of regular patient care.


LIGHT PROBES 

Many researchers are trying to improve on 
existing techniques such as endoscopy, deliv- 
ering images from inside the body through 
fiber optics. Engineers at Duke University, 
North Carolina, for instance, have designed 
an optical system to search for premalignant 
changes in patients with Barrett's esophagus, 
in which stomach acid alters the cells lining the 
esophagus. The condition more than doubles 
the risk of esophageal cancer. Unlike conven- 
tional endoscopy, the Duke technique, called 
angle-resolved low-coherence interferometry, 
images structures beneath the surface of a cell
for a sort of optical biopsy. Adam Wax, one of 
the Duke engineers, says looking at the basal 
layer of the epithelium, about 300 micrometers 
beneath the surface, seems most diagnostically 
useful. The system splits infrared light into two 
beams, and compares how far each travels to 
determine how deep it penetrates into the cell. 
Measuring the angle at which light bounces off 
cellular structures reveals the size of structures 
at increasing depth. The resolution is high 
enough to distinguish a normal-sized nucleus, 
about 10 micrometers in diameter, and a larger, 
precancerous one at least 13 micrometers. 

Wax says his enhanced endoscopy could 
provide better targets for biopsies — and, eventually,
replace biopsies altogether. According
to the NCI, esophageal cancer causes nearly 
15,000 deaths in the United States each year. 
“We hope that by contributing this tool we'll 
be able to shift that number downwards — the 
way it’s gone with colonoscopy,” says Wax, who
has launched a company, Oncoscope, to raise 
funds for clinical trials. 

A similar light-based technique, optical 
coherence tomography (OCT), could detect
non-melanoma skin cancer below the surface 
of the skin, where standard visual exams can't 
see. Where Wax aims to get a precise measure- 
ment of cell structures, OCT provides images 
that doctors can examine. OCT, already
used by ophthalmologists to examine the
inside of the eye, also uses interferometry to
image intracellular structures so doctors can 
see if they’re abnormal. A British company, 
Michelson Diagnostics, is developing a hand- 
held OCT scanner to detect non-melanoma 
skin cancer below the surface of the skin.

“We're very good at seeing where the lesion 
is,” says biomedical engineer Gordon McKenzie,
Michelson’s medical applications director.
“What we're doing now is gathering the
evidence of whether we're seeing a cancer or 
a precancer.” He says the machine, VivoSight, 
is comparable in both appearance and cost to 
the ultrasound machines found in obstetri- 
cians’ offices. He hopes that the device, now 
in clinical trials, will reduce the number of
biopsies of non-malignant lesions, and make 
the biopsies that are performed more accurate. 
Right now, he says, as many as 80% of patients 
who have a malignant mole removed need 
more tissue excised after tests find that some 
cancerous tissue may have been left behind. 
Eventually, he says, their scanner might help 
find new lesions in the colon and cervix as well 
as the skin. 


NANOPARTICLES AND MOLECULES 

Telling the precancerous from the cancerous 
is useful, but the earlier it is possible to detect 
aberrant cellular changes, the closer that comes 
to being equivalent to prevention. “Diagno- 
sis is very nice, but screening is much more 
important,’ says Hossam Haick, a chemical 
engineer at Technion Israel Institute of Tech- 
nology in Haifa. Haick has built an array of 
nanosensors that, he says, detect biomarkers 
of certain cancers years before they develop. 
He bases this assertion on histological studies 
showing that cells producing these molecules 
eventually become cancerous. 

Haick’s nanosensors measure volatile 
organic compounds released by cells into the 
bloodstream and then exhaled — a sort of can- 
cer breathalyzer. The array consists of about 
40 gold nanoparticles attached to molecules 
sensitive to various organic compounds such 
as alcohols, benzenes and alkenes. The patient 
blows into a bag, and as the breath passes 
through the array any volatile organic com- 
pounds bind to their complementary sensors 
and emit a signal. Software probes the pattern
of any signals to get a signature of a type and
stage of cancer (see Telltale cancer breath). By
testing patients with known cancers, Haick 
identified signatures for lung, breast, colorectal 
and prostate cancer. He is working on finding 
signatures for ovarian, liver and gastric cancer, 
and has formed a company to commercialize 
the technology. 
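
The article does not describe how the software turns sensor signals into a signature, so the following is only an illustrative sketch of one simple way such pattern matching could work: comparing a new response vector against stored reference signatures and picking the nearest one. The sensor values, the condition labels, the four-sensor array and the function names are all hypothetical, and nearest-match by Euclidean distance is an assumed stand-in rather than Haick's actual method.

```python
import math

# Hypothetical reference signatures: mean response of each sensor
# (arbitrary units) for a known condition. Values are illustrative only.
REFERENCE_SIGNATURES = {
    "healthy": [0.10, 0.20, 0.10, 0.15],
    "lung": [0.80, 0.30, 0.10, 0.60],
    "colorectal": [0.20, 0.90, 0.70, 0.10],
}


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length response vectors."""
    if len(a) != len(b):
        raise ValueError("response vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_signature(sample, references=REFERENCE_SIGNATURES):
    """Return the label whose reference signature is nearest to the sample."""
    return min(
        references,
        key=lambda label: euclidean_distance(sample, references[label]),
    )


if __name__ == "__main__":
    # A made-up breath sample whose pattern resembles the "lung" reference.
    breath_sample = [0.75, 0.35, 0.15, 0.55]
    print(closest_signature(breath_sample))
```

A real system would need many more sensors, calibrated measurements and a validated statistical model; the sketch only shows the idea of matching a pattern of responses rather than reading any single sensor.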

One challenge in early detection is pick- 
ing up a hint of cancer from tumours still too 
small to show up under current screening tech- 
niques. Mehmet Toner, a biomedical engineer 
at Harvard Medical School and Massachusetts 
General Hospital, is building a sensor chip that 
can find a single circulating tumour cell (CTC) 
in a billion or more blood cells. A blood sample
is injected into the chip, along with antibod- 
ies that bind to proteins on the surface of the 
cancer cells. Also attached to the antibodies 
are tiny magnetic beads. Once suspect cells 
are tagged with those beads, they can easily be 
collected by other magnets for closer examina- 
tion. Toner says the chip, which he is devel- 
oping with Johnson and Johnson subsidiary 
Veridex, should prove to be far more sensitive 
than previous techniques. 

TELLTALE CANCER BREATH
Everyone’s breath contains certain chemicals (e.g., alcohols, benzenes and alkenes). People with early stage cancer have different concentrations of these compounds in their breath than people who are cancer free.
Each sensor in an array attracts a particular molecule found in the breath.
Electrodes consist of gold nanoparticles.

In addition to early detection, Toner says,
the blood test is attractive because it’s easier
— and cheaper — than something like a colonoscopy.
Colonoscopy is recommended for
everyone over 50 years of age, but Toner says
the CTC chip could act as a screen to rule out 
some patients—and perhaps to rule in younger 
people if the chip raises any issues. The aim 
of the Veridex collaboration is to manufac- 
ture 10,000 to 20,000 of these magnetic-bead 
chips over the next two years, then use them in 
clinical trials to see how well they detect CTCs. 
Toner notes that it is still an open question 
whether these cells, undetectable by current 
methods, will ever develop into cancers that 
can kill. Whereas today’s techniques may find 
cancer too late, an overly sensitive test might 
cause unnecessary alarm. “It could be that can- 
cer is something we all live with, but [current] 
tests are such that we only look at it when it 
becomes a problem,” he says. 


SEARCH FOR RELEVANCE 

In fact, such a risk of overdiagnosis is a major 
potential problem with early detection tech- 
nologies, says Sanjiv Sam Gambhir, at the 
Canary Center at Stanford for Cancer Early 
Detection. “Our goal isn’t to find all cancers 
early,” he emphasizes. “Our goal is to find early
relevant cancers — that is, cancers that will go 
on to kill.” That will require biologists to work
along with the technology developers — both 
to understand the significance of what the tests 
find and to identify new biological targets to 
refine the detectors. Gambhir, for instance, 
talks about identifying prognostic biomarkers 
— proteins that appear only in patients with 
the most aggressive tumours. 

Gambhir specializes in molecular imaging,
in which a molecule that binds to a specific
biological target is labelled, often with a
radioisotope, so that it shows up more clearly 
in existing imaging technologies. His group 
created a probe that binds to tumour cells, but
instead of a radioactive label, they attached 
a microbubble — a tiny, gas-filled sac that 
reflects sound waves and shows up brightly 
in an ultrasound image. Another approach 
relies on Raman light scattering, in which a 
small fraction of the photons reflecting off 
a surface shift their wavelength in a specific 
way. Gambhir injects probes labelled with 
gold nanoparticles, then uses a colonoscope to 
find their Raman signal. This technique could 
improve the sensitivity of colonoscopy, which 
is good at identifying polyps but tends to miss 
flat precancerous lesions. 

Although the ultimate goal is to develop 
techniques to screen the general population 
for early signs of malignancy, it is easier to 
tell if the new tests work when researchers 
already know a patient has cancer. Thus many
techniques are first being developed for use in 
the easy cases — staging tumours or identi- 
fying sites for biopsy in patients with known 
malignancies. The technological progression, 
says Toner, is to go “from metastatic cancer to 
diagnosed but early-stage cancer to high-risk 
groups free of cancer to the wider population”. 
Once researchers prove that the new technolo- 
gies work in that wider population, physicians 
will have at their disposal a powerful set of 
early warning systems that can give us bad 
news in time for it to be not the worst news.


Neil Savage is a writer based in 
Massachusetts. 
LIFESTYLE


Breaking the 
cancer habit 


It’s the simple things in life that sometimes mean the most to
people, and do the most good.


BY CASSANDRA WILLYARD 


Drug companies spend billions of dollars
developing high-tech therapies
to deploy against cancer. Yet millions 
of people still die of the disease each year. 
Preventing cancer appears to be a far simpler 
proposition. The most effective steps to curb 
cancer are low-tech: get people to stop smok- 
ing and lose weight. Smoking, long known to 
be a risk factor, and obesity, more recently recognized
as one, together account for roughly
half of all cancer cases. But as anyone who has 
tried to kick the habit or lose a few pounds 
knows, both steps are easier said than done. 
To have a real impact on cancer, health offi- 
cials need to address the “causes of causes of 
cancer,” says Michael Marmot, an epidemiologist
at University College London (UCL). That
means finding new ways to curb smoking and 
obesity. “Simply telling people not to get fat is 
not very effective,” says Marmot. Policies that 


make unhealthy lifestyles more inconvenient 
and expensive, while making healthier ones 
easier and cheaper, can have a large impact. 
So can medical interventions. For example, 
researchers are developing new ways, such as 
the nicotine vaccine, that might help smokers 
quit for good. 


EAT LESS, MOVE MORE 

Many studies in the past couple of decades 
have examined the impact of dietary factors 
on cancer risk (see The omnivore’s labyrinth, 
page S22). How much we eat may be as important
as what we eat; indeed, overeating may be more
important than choosing the right foods. In a
study that followed 900,000 adults for 16 years, 
researchers at the American Cancer Society 
(ACS) found a significant association between 
body mass index and higher mortality owing 
to cancers of the oesophagus, colon, liver, gall- 
bladder, pancreas, kidney, breast (in women), 
uterus, cervix, ovary, prostate and stomach (in 
men). In 2003, the ACS estimated that excess
weight could account for one in seven cancer 
deaths among men and one in five among 
women in the United States.

Since 2003, the evidence linking body 
weight and cancer has grown stronger. One 
of the most comprehensive efforts to exam- 
ine this relationship came from the World 
Cancer Research Fund and the
American Institute for Cancer Research in 
Washington DC. At UCL, Marmot led a team 
of 21 scientists to review some 7,000 related 
studies, and again found a convincing link
between excess body fat and a variety of
cancers. They recommended that
people strive to be as lean and active as pos- 
sible, and that they avoid sugary drinks. 

Researchers have yet to fully understand 
how being overweight can cause cancer. 
The mechanism likely depends on the type 
of malignancy. For example, abdominal fat 
presses on the stomach, causing acid to splash 
up into the oesophagus. That acid leads to 
tissue damage, which can lead to oesopha- 
geal cancer. Oestrogen, produced by fat cells, 
appears to play a role in endometrial cancer 
and breast cancer in postmenopausal women. 
“Obese women have about three times the circulating
level of oestrogens as lean women,”
says Walter Willett, a prominent nutrition
researcher at Harvard University. 

Obesity can also cause the body to become 
less responsive to insulin. To compensate, 
the pancreas churns out more of this potent 
growth factor. Researchers posit that excess 
insulin can cause cancer cells to proliferate. 
Diabetics treated with the drug metformin, 
which lowers insulin levels, appear to have a 
lower risk of many cancers, including pan- 
creatic and breast. It is unclear, however, 
whether metformin’s anti-cancer activity is 
related to insulin.

Recent research has confirmed the com- 
plementary effect: losing weight can make a 
person less prone to cancer. Researchers in 
Sweden, for example, tracked two groups of 
2,000 overweight men and women and found 
that bariatric surgery cuts the risk of cancer in 
women by 42% (ref. 3). Another study found 
that gastric bypass cut the risk of developing 
cancer by 24% and the risk of dying of can- 
cer by 46% (ref. 4). (In both studies, this trend 
held only for women, not men, which sug- 
gests weight loss may have a particularly large 
impact on breast and endometrial cancers.)

Regular exercise can cut cancer risk as 
well, and not just because it often leads to
reduced weight. “We've been able to tease
out the individual effects of being physi- 
cally active versus being overweight,” says 
Christine Friedenreich, a cancer epidemiol- 
ogist and leader of population health research 
at the University of Calgary in Canada. One theory
is that active people tend to digest food
faster. “Being physically active may decrease 
the chance of any carcinogenic products
that happen to be going through the colon 
to have contact with the mucosal lining,” she 
explains. Similarly, better lung function in 
fit individuals could limit their exposure to 
airborne carcinogens. Hormones might also 
be a factor. Friedenreich and her colleagues 
found that postmenopausal Canadian women 
who walked for about three hours a week had 
lower oestrogen levels after a year than sedentary
women.


HEALTHY INFLUENCES 

Providing consumers with better nutritional 
information may help them choose a more 
nutritious diet and avoid the obesity that 
raises their cancer risk. Toward that end, David 
Katz, director of the Yale-Griffin Prevention 
Research Center, has developed an algorithm 
for ranking foods according to their overall 
healthiness. The formula, NuVal, accounts for 
ingredients including salt, vitamins, saturated 
fat, fibre, sugar, cholesterol and protein, as well as
overall calorific value. It assigns each food a 
score between one and a hundred. The higher
the value, the healthier the food (pineapple 99, 
butter cookie 1). According to Katz, NuVal has 
been used to score more than 90,000 foods. 

Economic incentives can also change nutri- 
tional behaviour. In January 2011, Wal-Mart, 
the largest grocer in the United States, pledged
to lower the price of fruits and vegetables, 
cut salt and sugar content in packaged foods, 
and stop selling any foods containing trans- 
fats within the next five years. 

In 2010, the US Department of Agriculture 
launched a pilot study to see if cutting the 
price of fruit and vegetables would prompt 
low-income families to eat more of them 
and reduce their consumption of less healthy 
foods. The 1,500 participating families will get 
30 cents added to their food benefit card bal- 
ance for every dollar they spend on fruits and 
vegetables. The study is set to begin in late 2011 
and wrap up in 2013.


[Figure: RISK AND REWARD. Through diet and exercise, individuals can take some control over their susceptibility to cancer. Factors shown: fruits; salt, salted and salty foods; alcoholic drinks; physical activity; body fatness; lactation.]

An alternative strategy would be to tax 
unhealthy foods, like soda and sugary sports 
drinks. Denmark already has such a tax. “They 
put a very high tax on sugar-sweetened bev- 
erages, a medium tax on diet beverages, and 
no tax on water and low-fat milk,” says Barry 
Popkin, a nutritionist at the University of 
North Carolina's Interdisciplinary Center for 
Obesity. 

A study published in February 2011 hints 
that taxing the unhealthy foods might be the 
best approach. Researchers at the State Univer- 
sity of New York at Buffalo recruited 42 moth- 
ers to shop at a simulated supermarket stocked 
with pictures of everything from whole wheat 
bread to bananas to carbonated sweet drinks. 
Each participant was given $22.50 and asked 
to select a week’s worth of food. During five 
shopping trips, the researchers manipulated
the prices of the foods, first charging prices 
comparable to a local grocery store, then low- 
ering the price of healthy foods by 12.5% to 
25%, and then hiking the prices of unhealthy 
foods by roughly the same amount. Result: 
raising the price of junk food lowered the total 
calories purchased. Healthy food subsidies, on 
the other hand, increased the total amount of 
fat, protein and carbohydrates bought. The 
participants spent the money saved to buy 
junk food.

Neither taxes nor subsidies will end the 
obesity epidemic. But just because an interven- 
tion doesn't lead to weight loss doesn’t mean 
it’s a dud, says Katz. He compares obesity to a 
flood. Each intervention, he says, is a sandbag 
in a much-needed levee. “No one of them by 
itself can stop the flood,” he says. “Only when 
we've done enough of these things in enough 
places will they add up to be a levee that’s 
higher than the floodwaters.” 

Smoking still accounts for a third of all can- 
cer deaths. And tobacco, unlike most foods, is 
addictive. So kicking the habit often requires 
medical intervention. 

Most over-the-counter smoking-cessation
therapies, such as the nicotine patch and nicotine
gum, curb the symptoms of withdrawal by
providing small doses of nicotine. Alternative 
strategies aim to make smoking less addictive. 
Varenicline, for example, a drug approved 
in 2006 and marketed as Chantix, partially 
binds to the nicotine receptor in the brain. It is 
designed to block nicotine, and also to partially 
activate the receptor. The idea is to prevent 
smokers from getting a rush if they smoke, 
but to give them enough dopamine to help 
curb cravings. And NicVax, an anti-smoking 
vaccine in phase III trials, prompts the body 
to raise an immune response against nicotine. 
“When someone smokes, the antibody attaches 
itself to the nicotine molecule,” says Dorothy
Hatsukami, a specialist in tobacco addiction at 
the University of Minnesota's Masonic Cancer 
Center. Bound together, the two molecules are 
too large to penetrate the blood-brain barrier.
“It reduces the level of nicotine that can enter
the brain at any one time,” says Hatsukami, 
who is leading one of the NicVax trials. 

Anti-smoking regulations could have an 
even greater impact on cancer. In 2009, US 
lawmakers gave the Food and Drug Admin- 
istration (FDA) unprecedented power to 
regulate tobacco. The FDA now prohibits all 
flavourings, with the exception of menthol, 
and requires tobacco product manufacturers 
to register their products with the FDA. The 
statute also gives the FDA the power to limit 
the amount of nicotine in tobacco products. In 
theory, the agency could cut nicotine to levels 
that would “render the products less or non- 
addictive,” says Clifford Douglas, a tobacco 
control specialist who heads the University of 
Michigan Tobacco Research Network. Such a
restriction could have an enormous impact. 
“If one wants to cut seriously into the tobacco 
epidemic, they must deal with nicotine,” he says. Local
regulations can play a role as well. Cities all 
over the world have banned smoking in bars 
and restaurants, but some are going a step 
further. In February 2011, city councillors in 
New York City voted to ban smoking in public 
parks, beaches and boardwalks. 

With respect to both obesity and smoking, 
part of the battle involves convincing people 
that much cancer is, in fact, avoidable. That 
could prove challenging, Marmot says. “I think 
most people think cancer is an act of God or 
Darwin.”


Cassandra Willyard is a freelance writer in 
New York. 
PERSPECTIVE


Tackling the real issues 


Successful prevention requires attacking the causes, says 
Stephen S. Hecht—and the main target remains tobacco. 


The best way to control cancer is to
prevent it from happening. Data 
from the United States indicate that 
the age-standardized incidence of cancer 
continues to decrease in this century, mainly 
owing to decreases in the major cancers: 
lung, prostate and colorectal in men; breast 
and colorectal in women. Similar trends
exist in other Western countries, although 
not in less developed or transitioning 
ones. These decreases are largely attributable
to cancer prevention. Millions of lives
have almost certainly been saved. 
Unlike prevention of heart disease, which 
is now a commonly understood goal of a 
healthy lifestyle, the concept of cancer pre- 
vention gets little attention. Curing cancer 
is newsworthy and glamorous. Prevention 
is not. One can meet the survivors of child- 
hood leukaemia or breast cancer and marvel 
at their good fortune. It is difficult to write 
an engrossing story of cancer about a person 
who did not get cancer in the first place. 
We now know a great deal about the 
causes of common cancers — and the bet- 
ter we understand cause, the more able we 
are to conceive of effective preventive meas- 
ures. Geographic and economic differences 
in cancer incidence and mortality are strik- 
ing. Common forms of cancer vary greatly 
between high-income countries and low- to 
moderate-income countries. Wealthy
countries are beset by high incidences of 
lung, breast, prostate and colorectal can- 
cers. Migrants from one geographical area 
to another adopt the risk factors of their new 
homes. Furthermore, the lifestyle habits of
certain groups, such as Mormons and Sev- 
enth Day Adventists, lead to significantly 
lower cancer rates, in part because of their 
abstinence from tobacco and alcohol. These
facts unmask the role of lifestyle factors in 
the common cancers and illustrate the huge 
potential of cancer prevention (see Breaking 
the cancer habit, page S16). 

Cigarette smoking, the cause of 90% of 
lung cancer, stands out as one of the best 
examples in which cause and prevention are 
intimately linked. Smokers are exposed to 
multiple DNA-damaging carcinogens and 
consequently have acquired multiple mutations
in genes that control cellular growth.
They are also exposed to multiple tumour-promoting
substances and inflammatory
agents that exacerbate the process. The 
addictive power of nicotine perpetuates the 
persistent exposure to these toxicants. 

Effective tobacco control — led by 
clean air legislation, taxation, and anti- 
tobacco advertising — is contributing to 
decreased lung cancer incidence in
some Western countries. We can expand 
our preventive activities against tobacco- 
related cancer by achieving a better under- 
standing of the biochemical, genetic and 
behavioural mechanisms of smoking. This 
knowledge could help us identify people 
who have a particularly high susceptibil- 
ity to these cancers; we could then target 
those individuals for new prevention 
measures, such as a nicotine vaccination 
and chemoprevention. 


We know less about major causes of 
breast, prostate and colorectal cancer. Estro- 
gen is clearly involved in breast cancer aetiol- 
ogy and androgens are critical in prostate 
cancer development. Consumption of red 
meat cooked at high temperatures plays a 
role in colorectal cancer. We must expand 
our understanding of the causes of these 
cancers and translate it to cancer preven- 
tion strategies. 

There are other significant and success- 
ful strategies for primary and secondary 
cancer prevention: vaccination and screen- 
ing for cervical cancer; vaccination against 
hepatitis B and avoidance of aflatoxins for 
liver cancer; avoidance of excessive sunlight 
exposure for skin cancer; limiting alcohol 
consumption for head and neck cancer, and 
liver cancer. 

Education and public outreach are 
critical. Here, we might emulate our col- 
leagues in cardiovascular disease preven- 
tion research. There is a sharply increased 
awareness of the preventive power against 
heart disease of low-fat diets, cholesterol 
reduction (in part owing to statins), treat- 
ment of hypertension, exercise, and avoid- 
ance of tobacco. This awareness, along 
with improved medical care, has led to a
significant reduction in deaths from heart 
disease — from about 320 deaths per 
100,000 people (younger than 85 years) in 
1975 to 130 deaths per 100,000 people in 
2006. This almost 60% drop in mortality 
rate far exceeds the 11% decrease in the 
rate of cancer deaths (from 180 to 160 per 
100,000 people) over the same period.
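
The two percentages quoted above follow directly from the death rates given in the text. As a minimal sketch of the arithmetic (the function name `percent_decline` is illustrative and not taken from any source):

```python
def percent_decline(before: float, after: float) -> float:
    """Relative decline from `before` to `after`, expressed as a percentage."""
    return (before - after) / before * 100.0


# Deaths per 100,000 people (younger than 85 years), as quoted in the text.
heart_1975, heart_2006 = 320.0, 130.0
cancer_1975, cancer_2006 = 180.0, 160.0

heart_drop = percent_decline(heart_1975, heart_2006)
cancer_drop = percent_decline(cancer_1975, cancer_2006)

print(f"Heart disease: {heart_drop:.1f}% decline")
print(f"Cancer: {cancer_drop:.1f}% decline")
```

The heart disease figure comes out just under 60% and the cancer figure at about 11%, consistent with the rounded values in the text.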

As we allocate research resources on 
the life-saving potential of cancer pre- 
vention, we need to keep in mind the 
prime importance of lifestyle factors in 
cancer aetiology, and continue to support 
studies to better understand the specific 
causes and mechanisms of common 
cancers. We need to identify susceptible 
individuals and target them for preven- 
tive interventions, including chemo- 
prevention using non-toxic or dietary 
agents with demonstrated efficacy. We 
should not be distracted by fleeting, 
flamboyant approaches that take us off 
the main task of dealing with the lifestyle 
factors that link cause and prevention.


Stephen S. Hecht is Wallin Professor of 
Cancer Prevention at the University of 
Minnesota’s Masonic Cancer Center. 
e-mail: hecht002@umn.edu
Portents of malignancy


Being able to determine an individual’s chances of developing cancer will greatly improve 
risk management strategies and recruitment to clinical trials. 


BY VICKI BROWER 


Before deciding to light your first cigarette, imagine knowing your exact
chance that this will lead to cancer.
Although smoking is the main cause of lung
cancer, only 10%-20% of smokers and former
smokers actually develop the disease. The reasons
for this, and for the changes that lead
to many cancers, have eluded researchers for
decades.

Lung cancer, as the biggest cause of cancer
death worldwide, is a priority of prevention
strategies. And to make the biggest impact
in cancer prevention, it is vital to target those
individuals with the highest risk of developing
cancer. This is where biomarkers come in.
Individuals identified as being high risk using
biomarkers could receive counselling for lifestyle
changes (see Breaking the cancer habit,
page S16), or they might be eligible for chemoprevention
(see First line of defence, page
S5). Even before that point, using biomarkers
to select people for cancer prevention studies
would allow for more powerful trials (see
Designing smarter cancer prevention trials,
page S20). Indeed, there is a clear distinction
between the types of biomarkers that help
identify risk before cancer has developed and
those that indicate that cancer is just starting
to grow.


The process of carcinogenesis takes years, if 
not decades. The search is on to discover and 
validate the often-subtle, microscopic changes 
in the constituents of the blood, sputum, urine 
or tissue samples that herald cancer. Before
oncogenic pathways are activated, it may be
possible to identify sluggish DNA repair
mechanisms or changes in gene expression, or
to detect the low-level immune response to
the presence of a nascent tumour.

This research activity is not without its con- 
troversies. There have been numerous false 
starts, where promising biomarkers did not 
stand up to rigorous testing. To be useful, a biomarker
needs to have a sensitivity (the likelihood
that it detects disease when disease is present) of at least
90%. The other key quality is specificity: the
probability that a person without the disease tests
negative, so that a positive signal is a true sign
of disease and not a false alarm. That, too, should
be 90% or more for a biomarker to be of clinical
value. Although no lung cancer biomarkers
yet meet that 90/90 standard, there are several 
promising candidates. 
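
To make the 90/90 standard concrete, here is a minimal sketch of how sensitivity and specificity are computed from validation counts. It also shows how the share of positive results that are true positives (the positive predictive value) depends on how common the disease is in the group tested. The counts and the two prevalence values are hypothetical, chosen only for illustration, and the function names are not from any source.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of people with the disease whom the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of disease-free people whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


def positive_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """Probability that a positive result reflects real disease."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1.0 - spec) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Hypothetical validation study: 100 people with cancer, 1,000 without.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=900, false_pos=100)
meets_90_90 = sens >= 0.90 and spec >= 0.90

# Hypothetical prevalences: a general population versus a high-risk group.
ppv_general = positive_predictive_value(sens, spec, prevalence=0.01)
ppv_high_risk = positive_predictive_value(sens, spec, prevalence=0.15)

print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, meets 90/90: {meets_90_90}")
print(f"PPV at 1% prevalence: {ppv_general:.2f}")
print(f"PPV at 15% prevalence: {ppv_high_risk:.2f}")
```

Under these assumed numbers, a test that just meets the 90/90 bar still yields mostly false alarms when the disease is rare, which is one reason to concentrate testing on high-risk groups.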


CHANGING EXPRESSION 

Avrum Spira, a pulmonologist at Boston 
University, has been using bronchoscopy 
to brush cells from the bronchial airways of 
healthy smokers and non-smokers, followed 
by gene expression profiling to compare thou- 
sands of genes. He has found that the cells 


lining smokers’ airways show signs of genomic 
changes related to inflammation and cell pro- 
liferation, even when they appear normal with 
standard bronchoscopy. By comparing these 
results to those of smokers with suspected 
lung cancer, Spira’s group has identified an 
80-gene signature that could identify patients 
with early-stage lung cancer with about 90% 
sensitivity. This signature, says Spira, “is the 
proverbial canary in the coalmine”. 

However, bronchoscopy is an invasive 
procedure, so Spira broadened his search to 
more accessible parts of the airway. He recently 
found abnormal gene expression in epithelial
cells of the nose and mouth that resembles that
in the bronchial airway. Analyzing these cells
using a simple swab could serve “as a mass 
screening tool in population-based studies,” 
he predicts. 

Digging into the pathways underlying these 
precancerous changes, Spira, together with 
genomicist Andrea Bild from the University 
of Utah, identified the phosphatidylinositol 3-kinase
(PI3K) signalling pathway, already known to
be involved in the development of cancer. By
comparing PI3K activity levels in cells from 
apparently healthy smokers to those of smok- 
ers with mild-to-moderate abnormalities, they 
found that this pathway “might be activated 
before tumorigenesis,” says Spira, making PI3K
a prime biomarker candidate. 
DESIGNING SMARTER CANCER PREVENTION TRIALS


In April 2010, an international team 
of researchers from academia and 
drug company GlaxoSmithKline 
reported that dutasteride, a drug 
already approved for the treatment 
of benign prostatic hyperplasia, 
reduced the chances that men 
considered at high risk for prostate 
cancer would develop the disease. 
The four-year trial included more 
than 8,100 men and met the gold 
standard for clinical trials: it was 
randomized, double-blind, and 
placebo-controlled; it studied 
parallel groups at multiple medical 
centres; and it assessed outcomes 
with biopsies at two years and 
four years. In the end, men who 
took dutasteride were 23% less 
likely to have a positive biopsy for 
cancer than those on the placebo. 
GSK submitted this data in its 
application to the US Food and 
Drug Administration to market the 
drug for prostate cancer prevention;
this January, the FDA said no.

Although it is not unusual for the
FDA to reject a drug application
supported by apparently positive
data, this case illustrates the
particular challenges surrounding 
clinical trials for cancer prevention. 
When the aim is to decrease 
the incidence of cancer in large 


populations, studies on preventive 
agents require large patient cohorts 
— sometimes approaching 20,000 
participants — and take years or 
even decades to perform. This 
combination makes them especially 
unwieldy compared to tests with 
therapeutic compounds, which can 
much more quickly be seen to work, 


or not, by testing them exclusively 
in people who have the disease. In 
cancer prevention drug trials, the 
usual gold standard barely rates a 
bronze. 

Since preventives are intended 
for apparently healthy patients, 
trials require a high confidence 
that the anticipated anticancer 
benefit will outweigh any harmful 
side effects. In the dutasteride trial, 
statistical analysis showed that 
the decrease in cancer was driven 
mainly by a reduction in less serious 
tumours that might not even require 
treatment. In addition, men who 
took the drug were slightly more 
likely than those on a placebo to 
develop more aggressive tumours. 
The FDA’s expert advisory panel 
concluded that the prevention 
benefits failed to outweigh this risk. 

Researchers say two things are 
needed to decrease the length and 
size of prevention studies. One is 


The PI3K pathway might also be used for 
chemoprevention. Early trials have shown that 
the administration of a compound that decreases
PI3K activity causes regression of abnormal 
lesions in the airways of high-risk smokers.


DNA DAMAGE 

As part of daily living, DNA frequently sustains 
damage. If not repaired, this can lead to muta- 
tions that replicate, resulting in abnormal and 
then cancerous growths. Certain mechanisms 
usually prevent this from occurring. The enzyme 
8-oxoguanine DNA glycosylase (OGG1) repairs 
DNA by excising damaged bases (see DNA repair 
duties, page S21). Biochemists Zvi Livneh and
Tamar Paz-Elizur, at the Weizmann Institute in 
Rehovot, Israel, discovered that levels of OGG1 
can also be used to predict an individual’s risk of 
developing lung cancer. 

By measuring OGG1 concentration in blood
samples, Livneh and Paz-Elizur found that 40% 
of people with lung cancer had low levels of the 
enzyme compared to 4% of healthy individuals. 
Smokers with low OGG1 activity were 5 to 10 
times more likely to develop lung cancer than 
smokers with normal OGG1; when compared to 
non-smokers with normal OGG1 activity, the risk
skyrocketed to 120 times higher. The same
blood test could be broadened to other cancers. 
For example, smokers with 
OGG1 is only one of an unknown number
of DNA repair enzymes; low levels of
any of them could be associated with can- 
cer development. Livneh and Paz-Elizur 
have expanded their research to include two 
additional DNA repair enzymes — AAG 
and APE1 — to cover people with “different 
risk factors to develop a certain cancer,” says
Livneh. A study is underway to assess their
performance and results are expected in 
mid-2011. 

It is unlikely that any single test, however 
many markers included, will be sufficient to 
gauge the risk of cancer development. “We 
have an additional ongoing study which 
explores a two-stage protocol for lung can- 
cer prevention,” says Livneh. The first stage 
involves Livneh and Paz-Elizur’s DNA repair 
biomarkers plus five biomarkers developed 
by other groups. These biomarkers measure: 
alteration in gene expression; levels of DAP 
kinase, an enzyme involved in programmed cell
death; antibodies to mutant p53, a sign that a 
cell’s tumour suppressor system is damaged; 
markers of inflammation; and variations in 
cancer-related genes. “Together these bio- 
markers are expected to yield a better risk 
assessment than one type alone,” says Livneh.
Individuals identified as high risk will be 
tested using spiral computed tomography 
(CT). “For such a high-risk group, spiral CT 
early detection of lung cancer might be cost-effective
and life saving,” adds Livneh.

In the initial stages of cancer, the body 
is often able to recognize abnormal cell 
changes and raise a response, producing auto-antibodies.
However, this response is limited,
and in the later stages of cancer, the immune 
system becomes compromised and can no 
longer identify and attack cancer cells. Auto- 
antibodies are therefore prime candidates for 
biomarkers of early stage cancer. 

By examining auto-antibody formation in 
presymptomatic individuals who later went 
on to develop lung cancer, Samir Hanash, at 
the Fred Hutchinson Cancer Center in Seat- 
tle, Washington, has identified three impor- 
tant antigens — annexin-1, 14-3-3 Theta and 
LAMR-1 — regarded by the immune system 
as foreign. So far, specificity of these biomarkers
is high but sensitivity lingers around 60%. 
The challenge for Hanash is to find additional 
candidate antigens that improve on the perfor- 
mance of this 3-antigen panel. 

These figures might be improved by look- 
ing for even earlier signs of cancer. Through 
the Women’s Health Initiative and Physicians’
Health Study, Hanash has access to blood sam- 
ples that were collected up to eight years before 
a patient was diagnosed with lung cancer. In 
addition, he is searching for biomarkers of 
lung cancer in former smokers and in people 
who never smoked. “It turns out that most of 
the blood markers we have identified among 
smokers are also applicable to non-smokers,” 
says Hanash. 

In spite of major investment in biomarker 
development over the past 15 years, the field 
of cancer prevention biomarkers looks woe- 
fully thin. One of the main reasons, according 
to identify high-risk populations
to be the preferred subjects for 
the trials. The second is surrogate 
endpoints that can provide evidence 
of whether a preventive drug is 
working — and do this in just a 
few years, rather than decades. 
The key to both is finding better 
biomarkers — the genes, proteins, 
and cellular metabolites that can be 
measured and associated with the 
development of cancer. 

Patterns of these biomarkers 
that can be uniquely linked with 
one type of cancer can make it 
easier to estimate an individual’s 
cancer risk. Selecting highest-risk 
patients for studies increases the 
statistical power of trials with a 
smaller number of participants. 
As a second benefit, high-risk 
cohorts can also shorten trials. If 
epidemiological studies show, for 
example, that a known percentage 
of patients carrying a certain gene 
will develop cancer within five 
years, researchers can restrict a 
prevention trial to those patients 


and run it for just that duration. 
Moreover, patients and regulators 
are likely to be more tolerant of side 
effects if the targeted users have a 
high chance of developing cancer 
without intervention. 

The designers of the dutasteride 
trial did select participants judged 
to be at higher risk of developing 
prostate cancer. However, they did 
so by looking for elevated levels 
of prostate-specific antigen (PSA) 
— a protein whose utility as a 
biomarker for prostate cancer is a 
matter of debate. If a fully validated 
biomarker for prostate cancer had 
existed, GSK might have been able 
to design a dutasteride trial that 
required fewer participants and 
could have yielded a more definitive 
outcome. In particular, looking at 
the drug’s effect (or lack thereof) on 
the biomarker may have clarified 
whether the increase in detected 
higher-grade cancers was due to 
the drug or simply an artefact of 
the tumours becoming more easily 
detected owing to dutasteride’s 


shrinking of the prostate. 

Some biomarkers may even 
function as the surrogate endpoints 
needed to shorten prevention 
trials. If, say, a specific group of 
proteins reliably increases in 
the blood of patients during the 
earliest, precancerous stages of 
disease, doctors could monitor 
those proteins rather than relying 
on biopsies to detect malignancy. 
Molecular biomarkers of potential 
toxicity, such as the activity of drug- 
metabolizing enzymes, could also 
help researchers monitor subjects’ 
safety and response to drug 
candidates in clinical trials. 

Scott Lippman, an oncologist and 
cancer prevention researcher at the 
University of Texas MD Anderson 
Cancer Center, has proposed 
fully integrating biomarkers 
chemoprevention development. 
After evaluating biomarkers in 
animal models, researchers would 
do epidemiologic studies linking the 
biomarkers to human cancers. They 
would next model the likelihood that 
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patients with specific biomarkers 
will develop cancer. Then, in a ‘phase 
0’ step between preclinical and 
phase | clinical trials, researchers 
could test sub-therapeutic doses to 
assess a drug’s behaviour in healthy 
patients without risking harm. 
Lippman argues that this approach 
could yield better decisions on 
whether to undertake a lengthy, and 
costly phase Ill trial — and speed 
the development of preventive 
agents. Indeed, the fact that 
GlaxoSmithKline skipped some of 
these steps might have played a role 
in the FDA's decision on dutasteride. 
The drug inhibits the enzyme that 
converts testosterone to the more 
potent 5o-dihydrotestosterone. But 
neither molecule is yet a validated 
biomarker for prostate cancer. 


Erika Jonietz is a science writer 
in Austin, Texas. 
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to Eleftherios Diamandis, a clinical biochem- 
ist at the University of Toronto, is because of 
poor study design with weak endpoints and 
little statistical rigor’. Furthermore, most 
research efforts have focused on biomarkers 
that monitor treatment. In fact, most biomark- 
ers in clinical use are not suitable for popula- 
tion screening or for early diagnosis, observes 
Diamandis. 

Diamandis claims that previous research 
into cancer biomarkers were looking in the 
wrong places. Too often efforts have focused 
on genetic markers, which in terms of cancer 
“represent ‘digital information’ — yes or no. 
This is not true for metabolomic or proteomic 
biomarkers, which are associated with quan- 
titative changes’, he says. But such biological 


markers are delicate. “They can be influenced 
by sample collection and storage methods, 
benign diseases, and even diet and drugs,” he 
explains. A difficulty of identifying quantita- 
tive biomarkers that are both highly sensitive 
and highly specific is that data analytical biases 
are introduced. “It is not surprising that seem- 
ingly spectacular data on new biomarkers are 
subsequently found to be not reproducible, 
and therefore unsuitable for use in clinical 
practice,” Diamandis concludes. 

George Poste, head of the Complex Adap- 
tive Systems Initiative at Arizona State Univer- 
sity in Tempe, agrees that biomarker research 
is yet to deliver on its promise for these and 
other reasons. Part of the problem, he says, is 
that until recently, most investigator-initiated 
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research has been too small and non-uniform 
to yield meaningful results. A lack of stand- 
ardization in sample collection and processing, 
the use of cell lines instead of patient biopsies 
for research, and an insufficient number of 
patient samples are reasons for the dearth of 
meaningful biomarker development. More- 
over, the field needs much more funding to 
encourage collaborative research and a ‘big 
science’ approach, says Poste. Government 
and industry funding must step up to the 
plate, he adds. 

Poste cites the US National Cancer Institute's 
Cancer Human Biobank and the UK’s Biobank 
as successful examples of big science and what 
it can do when the community invests in this 
research. He notes that historically the lion’s 
share of cancer funding has gone to treat- 
ment, not prevention. “But the real issue is 
how can we catch cancer very early on, before 
it spreads,” says Poste. This is the realm of bio- 
markers. “If we can find cancer in its earliest 
stages, it might be possible in the future to 
prevent it? = 


Vicki Brower is a freelance writer in New York. 
DESIGNING SMARTER CANCER PREVENTION TRIALS 


In April 2010, an international team 
of researchers from academia and 
drug company GlaxoSmithKline 
reported that dutasteride, a drug 
already approved for the treatment 
of benign prostatic hyperplasia, 
reduced the chances that men 
considered at high risk for prostate 
cancer would develop the disease. 
The four-year trial included more 
than 8,100 men and met the gold 
standard for clinical trials: it was 
randomized, double-blind, and 
placebo-controlled; it studied 
parallel groups at multiple medical 
centres; and it assessed outcomes 
with biopsies at two years and 
four years. In the end, men who 
took dutasteride were 23% less 
likely to have a positive biopsy for 
cancer than those on the placebo. 
GSK submitted these data in its
application to the US Food and
Drug Administration to market the
drug for prostate cancer prevention.
This January, the FDA said no.
Although it is not unusual for the
FDA to reject a drug application
supported by apparently positive
data, this case illustrates the
particular challenges surrounding
clinical trials for cancer prevention.
When the aim is to decrease
the incidence of cancer in large
populations, studies on preventive
agents require large patient cohorts,
sometimes approaching 20,000
participants, and take years or
even decades to perform. This
combination makes them especially
unwieldy compared to tests with
therapeutic compounds, which can
much more quickly be seen to work,
or not, by testing them exclusively
in people who have the disease. In
cancer prevention drug trials, the
usual gold standard barely rates a
bronze.

Since preventives are intended 
for apparently healthy patients, 
trials require a high confidence 
that the anticipated anticancer 
benefit will outweigh any harmful 
side effects. In the dutasteride trial, 
statistical analysis showed that 
the decrease in cancer was driven 
mainly by a reduction in less serious 
tumours that might not even require 
treatment. In addition, men who 
took the drug were slightly more 
likely than those on a placebo to 
develop more aggressive tumours. 
The FDA’s expert advisory panel 
concluded that the prevention 
benefits failed to outweigh this risk. 

Researchers say two things are 
needed to decrease the length and 
size of prevention studies. One is 


The PI3K pathway might also be used for
chemoprevention. Early trials have shown that
the administration of a compound that decreases
PI3K activity causes regression of abnormal
lesions in the airways of high-risk smokers.


DNA DAMAGE 

As part of daily living, DNA frequently sustains 
damage. If not repaired, this can lead to muta- 
tions that replicate, resulting in abnormal and 
then cancerous growths. Certain mechanisms 
usually prevent this from occurring. The enzyme 
8-oxoguanine DNA glycosylase (OGG1) repairs 
DNA by excising damaged bases (see ‘DNA repair
duties’, page S21). Biochemists Zvi Livneh and
Tamar Paz-Elizur, at the Weizmann Institute in 
Rehovot, Israel, discovered that levels of OGG1 
can also be used to predict an individual’s risk of 
developing lung cancer. 

By measuring OGG1 concentration in blood
samples, Livneh and Paz-Elizur found that 40%
of people with lung cancer had low levels of the
enzyme compared to 4% of healthy individuals.
Smokers with low OGG1 activity were 5 to 10
times more likely to develop lung cancer than
smokers with normal OGG1; when compared to
non-smokers with normal OGG1 activity, the risk
skyrocketed to 120 times more likely. The same
blood test could be broadened to other cancers. 
For example, smokers with 
OGG1 is only one of an unknown number
of DNA repair enzymes; low levels of
any of them could be associated with cancer
development. Livneh and Paz-Elizur
have expanded their research to include two
additional DNA repair enzymes, AAG
and APE1, to cover people with “different
risk factors to develop a certain cancer”, says
Livneh. A study is underway to assess their
performance and results are expected in
mid-2011.

It is unlikely that any single test, however 
many markers included, will be sufficient to 
gauge the risk of cancer development. “We 
have an additional ongoing study which 
explores a two-stage protocol for lung can- 
cer prevention,” says Livneh. The first stage 
involves Livneh and Paz-Elizur’s DNA repair 
biomarkers plus five biomarkers developed 
by other groups. These biomarkers measure: 
alteration in gene expression; levels of DAP
kinase, an enzyme involved in programmed cell
death; antibodies to mutant p53, a sign that a
cell’s tumour suppressor system is damaged;
markers of inflammation; and variations in
cancer-related genes. “Together these biomarkers
are expected to yield a better risk
assessment than one type alone,” says Livneh.
Individuals identified as high risk will be
tested using spiral computed tomography
(CT). “For such a high-risk group, spiral CT
early detection of lung cancer might be cost-effective
and life saving,” adds Livneh.

In the initial stages of cancer, the body 
is often able to recognize abnormal cell 
changes and raise a response, producing auto-antibodies.
However, this response is limited,
and in the later stages of cancer, the immune 
system becomes compromised and can no 
longer identify and attack cancer cells. Auto- 
antibodies are therefore prime candidates for 
biomarkers of early stage cancer. 

By examining auto-antibody formation in 
presymptomatic individuals who later went 
on to develop lung cancer, Samir Hanash, at 
the Fred Hutchinson Cancer Research Center in Seattle,
Washington, has identified three important
antigens (annexin-1, 14-3-3 Theta and
LAMR-1) regarded by the immune system
as foreign. So far, specificity of these biomarkers
is high but sensitivity lingers around 60%.
The challenge for Hanash is to find additional
candidate antigens that improve on the performance
of this three-antigen panel.
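
To see why a sensitivity of about 60% matters even when specificity is high, it can help to work through the screening arithmetic. The sketch below is illustrative only: the 95% specificity and the 1% prevalence are assumed values, not figures from Hanash's work, and `predictive_values` is a hypothetical helper written for this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a screening test using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # chance a positive result is real
    npv = true_neg / (true_neg + false_neg)   # chance a negative result is real
    return ppv, npv


# Sensitivity of roughly 60% comes from the text; specificity and
# prevalence are assumptions chosen only to illustrate the calculation.
ppv, npv = predictive_values(sensitivity=0.60, specificity=0.95, prevalence=0.01)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under these assumed inputs only about one positive result in ten would reflect a real cancer, and four in ten cancers would still be missed, which is one way to read the search for additional antigens.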

These figures might be improved by looking
for even earlier signs of cancer. Through
the Women’s Health Initiative and Physicians’
Health Study, Hanash has access to blood samples
that were collected up to eight years before
a patient was diagnosed with lung cancer. In
addition, he is searching for biomarkers of 
lung cancer in former smokers and in people 
who never smoked. “It turns out that most of 
the blood markers we have identified among 
smokers are also applicable to non-smokers,” 
says Hanash. 

In spite of major investment in biomarker 
development over the past 15 years, the field 
of cancer prevention biomarkers looks woe- 
fully thin. One of the main reasons, according 


to identify high-risk populations 
to be the preferred subjects for 
the trials. The second is surrogate 
endpoints that can provide evidence 
of whether a preventive drug is 
working — and do this in just a 
few years, rather than decades. 
The key to both is finding better 
biomarkers — the genes, proteins, 
and cellular metabolites that can be 
measured and associated with the 
development of cancer. 

Patterns of these biomarkers 
that can be uniquely linked with 
one type of cancer can make it 
easier to estimate an individual’s 
cancer risk. Selecting highest-risk 
patients for studies increases the 
statistical power of trials with a 
smaller number of participants. 
As a second benefit, high-risk 
cohorts can also shorten trials. If 
epidemiological studies show, for 
example, that a known percentage 
of patients carrying a certain gene 
will develop cancer within five 
years, researchers can restrict a 
prevention trial to those patients
and run it for just that duration.
Moreover, patients and regulators 
are likely to be more tolerant of side 
effects if the targeted users have a 
high chance of developing cancer 
without intervention. 
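
The link between baseline risk and trial size can be made concrete with a standard sample-size approximation for comparing two proportions. This is a rough sketch, not the method used in any trial discussed here: the 23% relative reduction is borrowed from the dutasteride result, while the baseline risks, the 5% significance level and the 80% power are assumptions, and `participants_per_arm` is a hypothetical helper.

```python
from math import ceil
from statistics import NormalDist


def participants_per_arm(control_risk, relative_reduction, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided comparison of two
    proportions, using the usual normal approximation."""
    treated_risk = control_risk * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (control_risk * (1 - control_risk)
                + treated_risk * (1 - treated_risk))
    effect = control_risk - treated_risk
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical baseline risks of developing cancer during the trial.
for risk in (0.02, 0.10, 0.25):
    n = participants_per_arm(control_risk=risk, relative_reduction=0.23)
    print(f"baseline risk {risk:.0%}: about {n} participants per arm")
```

With the same relative benefit, the required enrolment falls steeply as the baseline risk rises, which is the statistical case for recruiting high-risk cohorts.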

The designers of the dutasteride 
trial did select participants judged 
to be at higher risk of developing 
prostate cancer. However, they did 
so by looking for elevated levels 
of prostate-specific antigen (PSA) 
— a protein whose utility as a 
biomarker for prostate cancer is a 
matter of debate. If a fully validated 
biomarker for prostate cancer had 
existed, GSK might have been able 
to design a dutasteride trial that 
required fewer participants and 
could have yielded a more definitive 
outcome. In particular, looking at 
the drug’s effect (or lack thereof) on 
the biomarker may have clarified 
whether the increase in detected 
higher-grade cancers was due to 
the drug or simply an artefact of 
the tumours becoming more easily 
detected owing to dutasteride’s
shrinking of the prostate.

Some biomarkers may even 
function as the surrogate endpoints 
needed to shorten prevention 
trials. If, say, a specific group of 
proteins reliably increases in 
the blood of patients during the 
earliest, precancerous stages of 
disease, doctors could monitor 
those proteins rather than relying 
on biopsies to detect malignancy. 
Molecular biomarkers of potential 
toxicity, such as the activity of drug- 
metabolizing enzymes, could also 
help researchers monitor subjects’ 
safety and response to drug 
candidates in clinical trials. 

Scott Lippman, an oncologist and 
cancer prevention researcher at the 
University of Texas MD Anderson 
Cancer Center, has proposed
fully integrating biomarkers into
chemoprevention development.
After evaluating biomarkers in 
animal models, researchers would 
do epidemiologic studies linking the 
biomarkers to human cancers. They 
would next model the likelihood that 
patients with specific biomarkers
will develop cancer. Then, in a ‘phase
0’ step between preclinical and
phase I clinical trials, researchers
could test sub-therapeutic doses to
assess a drug’s behaviour in healthy
patients without risking harm.
Lippman argues that this approach
could yield better decisions on
whether to undertake a lengthy and
costly phase III trial, and speed
the development of preventive
agents. Indeed, the fact that
GlaxoSmithKline skipped some of
these steps might have played a role
in the FDA’s decision on dutasteride.
The drug inhibits the enzyme that
converts testosterone to the more
potent 5α-dihydrotestosterone. But
neither molecule is yet a validated
biomarker for prostate cancer.


Erika Jonietz is a science writer 
in Austin, Texas. 
to Eleftherios Diamandis, a clinical biochemist
at the University of Toronto, is
poor study design with weak endpoints and
little statistical rigor. Furthermore, most
research efforts have focused on biomarkers
that monitor treatment. In fact, most biomarkers
in clinical use are not suitable for population
screening or for early diagnosis, observes
Diamandis.

Diamandis claims that previous research
into cancer biomarkers was looking in the
wrong places. Too often efforts have focused
on genetic markers, which in terms of cancer
“represent ‘digital information’: yes or no.
This is not true for metabolomic or proteomic
biomarkers, which are associated with quantitative
changes”, he says. But such biological
markers are delicate. “They can be influenced
by sample collection and storage methods,
benign diseases, and even diet and drugs,” he
explains. A further difficulty in identifying quantitative
biomarkers that are both highly sensitive
and highly specific is that biases can be
introduced during data analysis. “It is not surprising
that seemingly spectacular data on new biomarkers are
subsequently found to be not reproducible,
and therefore unsuitable for use in clinical
practice,” Diamandis concludes.

DNA REPAIR DUTIES
8-Oxoguanine DNA glycosylase (OGG1) removes
bases that have been damaged by tobacco
smoke, ionizing radiation or oxidative stress.
[Figure: Damage to DNA oxidizes guanine to create
8-oxoguanine (8-oxoG), a mutagenic lesion.
Normal levels of OGG1: OGG1 attaches to 8-oxoG and snips it out
to be later replaced with a new guanine.
Low levels of OGG1: if 8-oxoG is not removed it will mispair
with adenine to create a genetic mutation.]

George Poste, head of the Complex Adaptive
Systems Initiative at Arizona State University
in Tempe, agrees that biomarker research
has yet to deliver on its promise for these and
other reasons. Part of the problem, he says, is
that until recently, most investigator-initiated
research has been too small and non-uniform
to yield meaningful results. A lack of standardization
in sample collection and processing,
the use of cell lines instead of patient biopsies
for research, and an insufficient number of
patient samples are reasons for the dearth of
meaningful biomarker development. Moreover,
the field needs much more funding to
encourage collaborative research and a ‘big
science’ approach, says Poste. Government
and industry funding must step up to the
plate, he adds.

Poste cites the US National Cancer Institute's 
Cancer Human Biobank and the UK’s Biobank 
as successful examples of big science and what 
it can do when the community invests in this 
research. He notes that historically the lion’s 
share of cancer funding has gone to treat- 
ment, not prevention. “But the real issue is 
how can we catch cancer very early on, before 
it spreads,” says Poste. This is the realm of biomarkers.
“If we can find cancer in its earliest
stages, it might be possible in the future to
prevent it.”


Vicki Brower is a freelance writer in New York. 
The omnivore’s labyrinth


Finding the right food to help reduce our chances of cancer can be a maze. But ongoing 
studies and a little inventive cooking might point us in the right direction.


BY SARAH DEWEERDT 


Twenty years ago, Paul Talalay was looking
for new ways to prevent cancer, so
he went grocery shopping. As a result
of his trip to the supermarket, the Johns Hopkins
pharmacologist and molecular scientist
discovered sulforaphane, a compound contained
in certain leafy vegetables. In a simple
assay using mouse cells, Talalay and colleagues
showed that sulforaphane dramatically boosts
the activity of certain phase II enzymes,
which form part of the body’s cancer-fighting
machinery. Later, they demonstrated sulforaphane’s
capacity to prevent tumour growth
in rats exposed to a carcinogen.

The rest of us haven't been so fortunate 
with our anticancer shopping basket. Despite 
much research over the past 40 years, it’s still 
not clear what to eat, or not eat, to help
prevent cancer. Promising initial findings have 
often turned into statistical dead ends, leaving 
us culinarily confused. 

In the mid-1970s, epidemiological studies
suggested that people who ate more fruits
and vegetables were at lower risk of several
cancers. Such findings led to public health
efforts to get people to eat more fresh produce.
In both the United States and the United Kingdom,
for example, official dietary recommendations
pushed for everyone to consume five
portions of fruit and vegetables each day.
COMMON SOURCE: Cranberries and other berries

CANCER TYPE: Various

BEST PREVENTION RESULTS: Flavonoids from cranberries
inhibited proliferation of 6 different
cancer cell lines (ref. 1).

STUDY SUBJECTS: In vitro human
cancer cell lines

1. Ferguson, P. J. et al. J. Nutr. 134, 1529-1535
(2004).


As epidemiologists began to use large- 
scale, prospective studies — a more powerful 
type of investigation — they frequently found 
weak, inconsistent or no links between 
fruit and vegetable consumption and cancer 
risk. Last year, University of Oxford epidemiologist
Tim Key concluded from nearly
three dozen large studies and meta-analyses
from the last 20 years that “at least in relatively
well-nourished Westernised populations, a
general increase in total fruit and vegetable
intake will not have a large impact on cancer
rates” (ref. 4).

This is the story for many other aspects of 
diet. A high-fat diet was thought to lead to an 
increased risk of breast cancer, until it wasn’t.
Early studies of dietary fibre suggested that 
higher consumption could decrease chances 
of colon cancer, but that didn’t pan out either. 

And yet, as Talalay found, some foods do 
contain anticancer compounds. These so- 
called phytonutrients include resveratrol 
in grapes, curcumin in turmeric, and many 
others (see recipe cards). Some of these mol- 
ecules, including sulforaphane and genistein 
(an isoflavone found in soybeans), are now on 
their way to becoming pharmaceutical agents 
in cancer prevention (see First line of defence, 
page S5). 

So why is figuring out the protective quali- 
ties of food so complicated? As scientists probe 
the complexities of diet and cancer, they are 
finding a host of reasons, from differences 
in types of vegetables to variations in human 
metabolism. At least seven factors underscore 
the difficulties. 

1. All vegetables are not created equal. Phy- 
tonutrients are most often found in pungent 
vegetables (like onions and garlic), bitter ones 
(like mustard greens), or ones with acquired 
tastes (like mushrooms). There are also “a 
number of fruit and vegetables that are not 
major sources of phytonutrients,” says Johanna 
Lampe, an epidemiologist at the Fred Hutch- 
inson Cancer Research Center in Seattle. 
And unfortunately, she says, these deficient 
foods include some of the most popular fruit 
and vegetables in the Western diet — apples 
and potatoes. 

When Talalay analysed his trolley-load of 
fruit and vegetables from the supermarket 
for potential cancer-fighting activity, he 
found “a vast difference” between different 
types of produce, he recalls. The most potent 
were cruciferous vegetables (those in the 
cabbage family), specifically broccoli and its 
payload of glucoraphanin — the precursor of 
sulforaphane. 

2. All heads of broccoli are not created equal. 
“So then we asked the question, is broccoli a 
good way to deliver sulforaphane?” says 
Talalay. “It turned out the answer was absolutely
no.” Different heads of broccoli varied
20-fold in their content of glucoraphanin. The
specific variety, growing conditions, time of 
year, distance of transport and other factors 
all affect the concentration of phytonutrients. 

Working with Johns Hopkins plant physi- 
ologist Jed Fahey, Talalay found that broc- 
coli seeds were 10 to 100 times richer in 
glucoraphanin than adult plants, and cer- 
tain varieties of seeds contained predictable 
amounts of the molecule. In fact, the most
consistent way to deliver glucoraphanin was
to use three-day-old broccoli sprouts.

Broccoli sprouts have become a powerful 
research tool for Talalay — and dozens of other 
scientists — to explore the role of sulforaphane 
in laboratory, animal and now human studies 
of cancer prevention. They also became a new 
food: Talalay likes to serve them at lab meet- 
ings with bagels and cream cheese. 

3. Human genomes vary. Eating a known 
amount of phytonutrient doesn’t guarantee 
that a predictable amount of the cancer-fight- 
ing molecule will enter the bloodstream. 
Differences here can be traced to variations in 
the genes involved in the digestive processes. 

For example, the glutathione S-transferase 
M1 gene (GSTM1) influences the speed at 
which the body metabolizes sulforaphane and 
expels it in the urine. The faster it happens, 
the less beneficial the broccoli. Agricultural
companies have developed several varieties
of ‘super broccoli’ that are high in both
glucoraphanin and related molecules to compensate
for the effects of faster variants of the
GSTM1 gene.

GSTM1 is the best studied of the genes 
that influence phytonutrient metabolism, 
but it is just one of a rapidly growing list. For 
example, people who carry two copies of a 
particular variant of the UGT1A1 gene make 
about 30% to 40% less than normal of a type 
of phase II enzyme. One study has shown that 
people with this genotype derive more cancer- 
protecting benefit from eating cabbage- and 
carrot-family vegetables — possibly because 
phytonutrients in these foods boost UGT1A1 
activity closer to normal.

4. Human microbiomes vary. Some of the
genes that determine the power of phytonutrients
are not human. Gut bacteria heavily
metabolize the phytonutrients from soy, turning
one type of isoflavone into another. So
depending on their intestinal bacteria, two
people who eat the same amount of soy each
day might receive not only different quantities
of isoflavones, but also different end-products.
Between 30% and 50% of people
harbour bacteria that produce equol, which
some scientists believe is one of the more
beneficial forms of isoflavone; around 80%
to 90% of people have bacteria that produce
O-desmethylangolensin, a less active molecule.

Gut microbiota reflect a complex interplay 
of diet and genetics. For example, vegetarians 
are more likely to produce equol than non- 
vegetarians, and Korean-American women 
are more likely to produce equol than white 
American women. 

5. Timing is everything. Epidemiological
studies of Asian populations show that
higher consumption of soy foods (tofu,
miso and the like) is associated with lower
breast cancer risk. Yet soy seems to provide
little protection to people who otherwise
eat a typical Western diet.

Equol may be part of that story, but there is 
also growing evidence that the age at which a 
person starts eating soy is critical to its effect. 
“In Asian women, soy seems to be more pro- 
tective, but the reason for that is probably that 
the Asian women started eating soy early in 
life versus most Caucasian women starting 
to eat it later,” explains Gertraud Maskarinec, 
an epidemiologist at the University of Hawaii 
Cancer Center. 

Much of the evidence for this hypothesis 
comes from animal studies: rats fed soy when 
they are young have fewer mammary tumours 
later in life, but rats fed soy only as adults do 
not get this benefit. There are a few suggestive 
clues in humans too. For example, women in 
Asia who grew up eating a traditional, soy-rich 
diet and then move to the West as adults do not 
seem to increase their risk of breast cancer.

6. Some phytonutrients are difficult to access. 
Many are found in small quantities in bulky 
foodstuffs, or only in particular types of sea- 
sonal fruit and vegetables. This means it is 
impractical to eat enough to noticeably reduce 
cancer risk. For example, many berries are rich 
in a group of phytonutrients called anthocya- 
nins, which are antioxidants and may have 
other cancer-fighting effects as well. “The 
problem with berries is that they are expensive 
and seasonal, so availability is a limiting factor 
for a lot of people,” says Cathie Martin, a plant
biologist at the John Innes Centre in Norwich, 
United Kingdom. 

Martin led a team that genetically engi- 
neered a tomato (which has few natural 
anthocyanins) to contain roughly the same 
concentration of anthocyanins as blueber- 
ries. In 2008, they showed that this deep 
purple tomato, known as Del/Ros1N, slowed 
tumour progression in a cancer-prone strain 
of mice. The mice fed Del/Ros1N lived longer 
than mice fed either ordinary red tomatoes or 
standard laboratory fare.

Unlike berries, tomatoes are one of the
largest food crops in the world. Modern growing
methods make them available all year
round, and they’re a popular and frequently
consumed food. The anthocyanin-rich tomato
could make potential anticancer foods much
more readily available (though purple pizza
sauce might take some getting used to).

CANCER TYPE: Glioma (brain)

BEST PREVENTION RESULTS: A
daily dose of 40 mg/kg achieved
long-term survival for 70% of
subjects.

STUDY SUBJECTS: Rats
harboring small subcutaneous
gliomas

Martin’s team has also engineered a tomato 
strain that has a resveratrol concentration one
thousand times higher than in red wine.
Resveratrol has anticancer activity in labora- 
tory studies, but the quantities in red wine are 
so small that any benefit from the resveratrol 
would be far outweighed by the adverse effects 
of drinking excessive volumes of alcohol. “I’m 
not trying to take away anybody’s red wine,” 
Martin laughs — but she suggests that the 
tomato could be a more practical approach 
for resveratrol delivery. 

7. The whole (diet) is greater than the sum 
of its parts. Sometimes, perhaps, it’s not one 
food but the whole recipe that can be protec- 
tive. That’s what Thomas Kensler, an environ- 
mental health specialist, found when he began 
to study broccoli-sprout tea in Qidong, on the 
east coast of China. Kensler, based at Johns 
Hopkins University, expected the local people’s
gut bacteria to transform the tea’s glucoraphanin
into active sulforaphane. “There was tremendous
interindividual variability,” he says.
“Some people were really good and some were 
really bad at this conversion reaction.” 

Talalay suggested to Kensler that adding a
bit of daikon radish to the tea might help
eliminate these differences. Daikon radish
contains an enzyme, myrosinase, that catalyses
the glucoraphanin conversion. It was a neat
trick, and one that demonstrates why it can be
so difficult to isolate anticancer elements in a diet.

COMMON SOURCE: Turmeric, curry

CANCER TYPE: Colon

BEST PREVENTION RESULTS:
Reduced the number of precancerous
tumors by up to 40% (ref. 1).
Reduced the number of precancerous
tumors by an average of 60% and
their size by an average of 50% (ref. 2).

STUDY SUBJECTS: Mice, bred to be
genetically susceptible to colon cancer (ref. 1).
Humans who carry a genetic mutation
of susceptibility to colon cancer (ref. 2).

1. Perkins, S. et al. Cancer Epidemiol. Biomarkers
Prev. 11, 535-540 (2002).

2. Cruz-Correa, M. et al. Clin. Gastroenterol.
Hepatol. 4, 1035-1038 (2006).

It is likely that these effects are not limited 
to experimental concoctions. Gut bacteria 
influence how phytonutrients are processed, 
but fibre intake and other aspects of the diet 
alter the gut ecosystem. Traditional Asian diets 
include not just large helpings of soy but also 
generous pours of green tea, which contains 
cancer-fighting epigallocatechins. Little is 
known about how diverse foods in a diet inter- 
act to affect cancer risk. 

Specialized diets are far from the only way 
to decrease our chances of cancer, but they do 
add flexibility to cancer prevention strategies. 
Kensler began investigating sulforaphane 
after working for several years in China. One 
in ten people will develop liver cancer there 
during their lifetime so cancer prevention 
can make a real impact. Yet pharmaceuticals 
would be too expensive for many people to 
buy, and some people simply don’t like taking
pills. “I began to recognize that Western
approaches aren’t going to work around the
world,” Kensler recalls.

A broccoli-sprout tea was the perfect choice 
for this population. “Culturally it fits right in 
with their behavioural patterns and their own 
notions of how to preserve health,” he says. 
Elsewhere, raw broccoli sprouts might be a 
more popular addition to diets. Some popu- 
lations will be open to getting phytonutrients 
from a genetically engineered tomato, while 
other societies would recoil from designer 
foods. And in some cultures, purified phyto- 
nutrients in dietary supplement form might be 
the best approach. “Different delivery mecha- 
nisms are going to be appropriate for different 
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COMMON SOURCE : 
Many different dishes 


CANCER TYPE : 
Colorectal 


BEST PREVENTION RESULTS:
High consumers have 30% lower
risk of cancer than the lowest
consumers.¹¹

12-month consumption of garlic
extract meant total size of polyps
was one-third those of controls.¹²
target populations,” says Kensler.

But where does that leave the rest of us, as we 
ponder what foods to buy? For the time being 
we are stuck with familiar advice — eat more 
fruits and vegetables, and more whole grains. 
It can't hurt: after all, despite the healthy eating 
campaigns, fewer Americans today are getting 
their five-a-day than were a decade ago (yet 
another factor that makes interpreting studies 
of fruit-and-vegetable consumption and can- 
cer rates more complicated). 

Better guidance might be on the way. “A 
lot of the current recommendations are very 
generic: five helpings of fruits and vegetables
a day, but which ones are not specified,”
notes Martin. “I think we’ll get closer to saying,
‘eat the ones that are purple’.” Future dietary
recommendations might also take genetics 
into account. We might learn how to tweak 
the composition of the gut microbiota, such 
as with probiotics, to maximize the cancer- 
fighting effects of foods. Plant breeders and 
farmers might pay more attention to phyto- 
nutrient content when they develop and sow 
vegetable varieties. 

“Hopefully science can work together with 
the food industry to provide foods that are rich 
in components that are deemed healthy and 
safe,” says Kensler. Perhaps then we will have
clearer messages about how we can eat our way
out of cancer danger.


Sarah DeWeerdt is a freelance science writer 
based in Seattle. 
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Interaction-based quantum metrology showing 
scaling beyond the Heisenberg limit 


M. Napolitano¹, M. Koschorreck¹, B. Dubost¹,², N. Behbood¹, R. J. Sewell¹ & M. W. Mitchell¹


Quantum metrology aims to use entanglement and other quantum
resources to improve precision measurement¹. An interferometer
using N independent particles to measure a parameter $\mathcal{X}$ can
achieve at best the standard quantum limit of sensitivity,
$\delta\mathcal{X} \propto N^{-1/2}$. However, using N entangled particles and exotic states²,
such an interferometer³ can in principle achieve the Heisenberg
limit, $\delta\mathcal{X} \propto N^{-1}$. Recent theoretical work⁴⁻⁶ has argued that
interactions among particles may be a valuable resource for quantum
metrology, allowing scaling beyond the Heisenberg limit.
Specifically, a k-particle interaction will produce sensitivity
$\delta\mathcal{X} \propto N^{-k}$ with appropriate entangled states and
$\delta\mathcal{X} \propto N^{-(k-1/2)}$ even without entanglement⁷. Here we demonstrate
‘super-Heisenberg’ scaling of $\delta\mathcal{X} \propto N^{-3/2}$ in a nonlinear,
non-destructive⁸,⁹ measurement of the magnetization¹⁰,¹¹ of an atomic
ensemble¹². We use fast optical nonlinearities to generate a pairwise
photon-photon interaction¹³ (corresponding to k = 2) while preserving
quantum-noise-limited performance. We observe super-Heisenberg scaling over
two orders of magnitude in N, limited at large numbers by higher-order
nonlinear effects, in good agreement with theory¹³. For a
measurement of limited duration, super-Heisenberg scaling allows
the nonlinear measurement to overtake in sensitivity a comparable
linear measurement with the same number of photons. In other
situations, however, higher-order nonlinearities prevent this crossover
from occurring, reflecting the subtle relationship between
scaling and sensitivity in nonlinear systems. Our work shows that
interparticle interactions can improve sensitivity in a quantum-limited
measurement, and experimentally demonstrates a new
resource for quantum metrology.

The most precise instruments are interferometric in nature, and 
operate according to the laws of quantum mechanics. A collection of 
particles, for example photons or atoms, is prepared in a superposition 
state, allowed to evolve under the action of a Hamiltonian containing 
an unknown parameter, $\mathcal{X}$, and measured in agreement with quantum
measurement theory. The complementarity of quantum measurements¹⁵
determines the ultimate sensitivity of these instruments.

Here we describe polarization interferometry, used, for example, in
optical magnetometry to detect atomic magnetization; similar theory
describes other interferometers. A collection of N photons, with circular
plus- and minus-polarization eigenstates, $|+\rangle$ and $|-\rangle$, is described by
single-photon Stokes operators $\hat{s}_i = \tfrac{1}{2}\,(|+\rangle, |-\rangle)\,\sigma_i\,(\langle +|, \langle -|)^{T}$,
where $\sigma_i$ ($i = x, y, z$) are the Pauli matrices, $\sigma_0$ is the identity and the
superscript “T” denotes transposition. In traditional quantum metrology,
a Hamiltonian of the form $\hat{H} = \hbar\mathcal{X}\sum_{j=1}^{N}\hat{s}_z^{(j)}$, where $\hbar$ denotes Planck’s
constant divided by $2\pi$, uniformly and independently couples the
photons to $\mathcal{X}$, the parameter to be measured¹. If the input state consists
of independent photons, the possible precision scales as $\delta\mathcal{X} \propto N^{-1/2}$,
the shot noise or standard quantum limit (SQL). The factor of $N^{-1/2}$
reflects the statistical averaging of independent results. In contrast,
entangled states can be highly, even perfectly, correlated, giving
precision limited by $\delta\mathcal{X} \propto N^{-1}$, the Heisenberg limit.
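
The statistical-averaging argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes each independent photon contributes a variance of 1/4 to a summed spin-1/2 (Stokes) component, sets the prefactor of the Heisenberg reference curve to 1, and uses hypothetical function names that do not come from the paper.

```python
import math

def sql_uncertainty(n_photons):
    """Relative uncertainty of a sum of n independent spin-1/2 outcomes."""
    signal = n_photons / 2.0             # mean of the collective component
    noise = math.sqrt(n_photons / 4.0)   # independent variances of 1/4 add up
    return noise / signal                # equals n_photons ** -0.5

def heisenberg_reference(n_photons):
    """Reference curve with N**-1 scaling (prefactor 1 is an assumption)."""
    return 1.0 / n_photons

for n in (10**2, 10**4, 10**6):
    print(n, sql_uncertainty(n), heisenberg_reference(n))
```

Quadrupling the photon number halves the first quantity but divides the second by four, which is the difference between the two limits.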


The above Hamiltonian is conveniently written $\hat{H} = \hbar\mathcal{X}\hat{S}_z$, where
$\hat{S}_i \equiv \sum_{j=1}^{N}\hat{s}_i^{(j)}$ is a collective variable describing the net polarization of
the photons. The independence of the photons manifests itself in the
linearity of this Hamiltonian. Recently, it has been shown that
interactions among particles, or, equivalently, nonlinear Hamiltonians, can
contribute to measurement sensitivity and give scaling beyond the
Heisenberg limit⁴. For example, a Hamiltonian $\hat{H} = \hbar\mathcal{X}\hat{S}^{k}$, that is, with
a kth-order nonlinearity in $\hat{\mathbf{S}} = (\hat{S}_x, \hat{S}_y, \hat{S}_z)$, contains k-photon
interaction terms $\hat{s}^{(j_1)} \otimes \hat{s}^{(j_2)} \otimes \cdots \otimes \hat{s}^{(j_k)}$. The number of such terms, and,
thus, the signal strength, grows as $N^{k}$, but the quantum noise from the
input states is unchanged. As a result, a sensitivity limit of $\delta\mathcal{X} \propto N^{-k}$
applies when entanglement is used, and $\delta\mathcal{X} \propto N^{-(k-1/2)}$ in the
absence of entanglement⁷. For k = 2, this gives scaling better than
the Heisenberg limit, so-called super-Heisenberg scaling⁷. We note
that interactions and entanglement are compatible and both improve
the scaling. The predicted advantage applies generally to quantum
interferometry, and proposed mechanisms to produce metrologically
relevant interactions include Kerr nonlinearities¹⁸, cold collisions in
condensed atomic gases, Duffing nonlinearity in nanomechanical
resonators¹⁹ and a two-pass effective nonlinearity with an atomic
ensemble²⁰. Topological excitations in nonlinear systems may also give
advantageous scaling²¹.
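
The exponents quoted in this paragraph can be tabulated directly. The following is a minimal sketch, assuming only the two scaling laws stated in the text ($N^{-k}$ with entanglement, $N^{-(k-1/2)}$ without); the function names are hypothetical.

```python
def sensitivity_exponent(k, entangled):
    """Exponent p in delta_X ~ N**p for a k-photon interaction term."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return -float(k) if entangled else -(k - 0.5)

def beats_heisenberg(k, entangled):
    """True when the predicted scaling is steeper than N**-1."""
    return sensitivity_exponent(k, entangled) < -1.0

for k in (1, 2, 3):
    print(k, sensitivity_exponent(k, True), sensitivity_exponent(k, False))
```

For k = 1 this reproduces the Heisenberg and shot-noise exponents; for k = 2 the unentangled case already gives -3/2.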

In this Letter, we study interaction-based quantum metrology using
unentangled probe particles. One challenge in demonstrating
super-Heisenberg scaling is to engineer a suitable nonlinear Hamiltonian.
Some nonlinearities have been shown to be intrinsically noisy,
whereas others give super-Heisenberg scaling but fall short of the ideal,
$\delta\mathcal{X} \propto N^{-(k-1/2)}$, under realistic conditions²². We use a cold atomic
ensemble as a light-matter quantum interface¹² to produce
quantum-noise-limited interactions, and use a Hamiltonian of the form
$\hat{H} = \hbar\mathcal{X}\hat{S}_z\hat{S}_0 = \hbar\mathcal{X}\hat{S}_z N/2$. This Hamiltonian gives a polarization
rotation that increases with photon number, without increasing quantum
noise¹³. The experiment, shown schematically in Fig. 1, uses pulses of
near-resonant light to measure the collective spin, $\hat{\mathbf{F}}$, of an ensemble of
$N_A \sim 10^{6}$ cold rubidium-87 atoms, probed on the $5S_{1/2} \rightarrow 5P_{3/2}$ $D_2$
line. The experimental system is described in detail in refs 8, 23. The
on-axis atomic magnetization, $\langle\hat{F}_z\rangle$, which plays the role of $\mathcal{X}$ in this
measurement, is prepared in the initial state $\langle\hat{F}_z\rangle = N_A$ by optical
pumping with resonant, circularly polarized light propagating along
the trap axis, z. A weak, on-axis magnetic field is applied to preserve $F_z$
during the measurements.

Pulses of $\hat{S}_x$-polarized, but not entangled, photons pass through the
ensemble and experience an optical rotation proportional to $\langle\hat{F}_z\rangle$. The
light-atom interaction Hamiltonian

$$\hat{H}_{\mathrm{eff}} = \hbar\alpha^{(1)}\hat{F}_z\hat{S}_z + \hbar\beta^{(1)}\hat{F}_z\hat{S}_z N/2$$

describes this paramagnetic Faraday rotation¹³. Both the linear term,
$\hbar\alpha^{(1)}\hat{F}_z\hat{S}_z$, and the nonlinear term, $\hbar\beta^{(1)}\hat{F}_z\hat{S}_z N/2$, cause rotation of the
plane of polarization from $\hat{S}_x$ (vertical) towards $\hat{S}_y$ (diagonal).
Detection of $\hat{S}_y$ then allows estimation of $F_z$. As described in
Supplementary Information, $\alpha^{(1)}$ and $\beta^{(1)}$ depend on the optical detuning,
$\Delta$, relative to the $F = 1 \rightarrow F' = 0$ transition; in particular, $\alpha^{(1)}(\Delta_0) = 0$
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Atoms 


Figure 1 | Atom-light interface. a, Experimental schematic: an ensemble of
$7 \times 10^{5}$ ⁸⁷Rb atoms, held in an optical dipole trap, is prepared in the state
$|F = 1, m_F = 1\rangle$ by optical pumping (OP). Linear ($P_1$, $P_2$) and nonlinear ($P_{\mathrm{NL}}$)
Faraday rotation probe pulses (in the order $P_1$, $P_{\mathrm{NL}}$, $P_2$) measure the atomic


for the specific detuning $\Delta_0 \approx 2\pi \times 468.5$ MHz, allowing a purely
nonlinear estimation to be studied.

The rotation angle is $\phi = \langle\hat{F}_z\rangle[A(\Delta) + B(\Delta)N]/2$, where $A \propto \alpha^{(1)}$
and $B \propto \beta^{(1)}$ both account for the temporal pulse shape and the
geometric overlap between the atomic density and the spatial mode of the
probe. The shot-noise-limited uncertainty in the rotation angle, due to
quantum uncertainty in the initial angle, is $\delta\phi = N^{-1/2}/2$. A contribution,
$\langle\hat{F}_z\rangle B(\Delta)\,\delta N/2$, from initial number fluctuations $\delta N = \langle N\rangle^{1/2}$ is
negligible for small rotation angles. This gives a measurement uncertainty of

$$\delta F_z = \left|\frac{\partial\phi}{\partial\langle\hat{F}_z\rangle}\right|^{-1}\delta\phi = \frac{1}{A(\Delta)N^{1/2} + B(\Delta)N^{3/2}} \qquad (1)$$

indicating a transition from SQL scaling, $\delta F_z \propto N^{-1/2}$, to
super-Heisenberg scaling, $\delta F_z \propto N^{-3/2}$, with increasing N.
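
The transition described by equation (1) can be made concrete by evaluating the local log-log slope of $\delta F_z$. The sketch below is an illustration, not the authors' analysis: the coupling constants are taken as $A = 3.3 \times 10^{-8}$ rad per atom and $B = 3.8 \times 10^{-16}$ rad per atom per photon, where the exponents are assumptions reconstructed from the calibration values quoted later, and the function names are hypothetical.

```python
import math

A_LIN = 3.3e-8    # rad per atom (assumed value)
B_NL = 3.8e-16    # rad per atom per photon (assumed value)

def delta_fz(n, a=A_LIN, b=B_NL):
    """Equation (1): spin uncertainty for n probe photons."""
    return 1.0 / (a * n**0.5 + b * n**1.5)

def loglog_slope(n, a=A_LIN, b=B_NL, factor=1.01):
    """Finite-difference estimate of d ln(delta_fz) / d ln(n)."""
    ratio = delta_fz(n * factor, a, b) / delta_fz(n, a, b)
    return math.log(ratio) / math.log(factor)

# Far below, at, and far above the photon number where both terms are equal
for n in (1e4, A_LIN / B_NL, 1e12):
    print(f"N = {n:.2e}  slope = {loglog_slope(n):+.3f}")
```

The slope moves from about -1/2 when the linear term dominates, through -1 where the two terms are equal (N = A/B), to about -3/2 when the nonlinear term dominates. This idealized curve ignores the saturation discussed later in the text.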

We use two probing regimes. The ‘linear probe’ consists of 40 1-μs
pulses (total illumination time, $\tau_{\mathrm{L}} = 40$ μs) spread over 400 μs with
detuning $\Delta_{\mathrm{L}} \gg \Delta_0$. Together with the number of photons, $N_{\mathrm{L}}$, used
in the experiment for the linear probe, this gives $A \gg N_{\mathrm{L}}B$, that is,
linear estimation, and provides⁸ a projection-noise-limited quantum
non-demolition measurement²⁴ of $F_z$, with uncertainty at the
parts-per-thousand level⁸. The ‘nonlinear probe’ consists of a single,
Gaussian-shaped, high-intensity pulse with a full-duration at
half-maximum of $\tau_{\mathrm{NL}} = 54$ ns, $N_{\mathrm{NL}}$ photons and a detuning $\Delta_0$, such that
$A \ll N_{\mathrm{NL}}B$. Crucially, having two probes allows us to calibrate the
nonlinear measurement precisely using a highly sensitive and
well-characterized independent measurement of the same sample.

We probe the same sample three times for each preparation. First we
use the linear probe, which gives a precise and non-destructive
measurement of $\langle\hat{F}_z\rangle$ via the rotation angle, $\phi_{\mathrm{L}}$. Then we use the nonlinear
probe, which gives a rotation angle, $\phi_{\mathrm{NL}}$, that is calibrated against the
‘true’ value (that is, with negligible error) provided by the linear probe.
Finally we use a second linear probe to estimate the rotation angle $\phi_{\mathrm{L}}'$,
with which we can estimate the damage to the atomic magnetization,
$\eta \equiv 1 - \phi_{\mathrm{L}}'/\phi_{\mathrm{L}}$, caused by the nonlinear probe.

The linear probe is calibrated using quantitative absorption imaging
to measure $N_A$, and we find that $A(\Delta_{\mathrm{L}}) = 3.3(1) \times 10^{-8}$ rad per atom.
The calibration of the nonlinear probe against the first linear probe is
shown in Fig. 2: we repeat the above pump-probe sequence while
varying $N_A$ in the range $1.5 \times 10^{5}$ to $3.5 \times 10^{5}$ to generate a $\phi_{\mathrm{L}}$-vs-$\phi_{\mathrm{NL}}$
correlation plot for a given value of $N_{\mathrm{NL}}$. Because both $\phi_{\mathrm{L}}$ and $\phi_{\mathrm{NL}}$ are
linear in $N_A$, we use linear regression to find the slope,
$b \equiv d\phi_{\mathrm{NL}}/d\phi_{\mathrm{L}} = B(\Delta_0)N_{\mathrm{NL}}/A(\Delta_{\mathrm{L}})$, for that value of $N_{\mathrm{NL}}$. The experiment is
repeated for a range of different $N_{\mathrm{NL}}$ values.
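
The regression step can be sketched with synthetic data. This is an illustration under stated assumptions rather than the authors' analysis code: the data are noise-free, the coupling constants and photon number are assumed values (with reconstructed exponents), and `fit_line` is a hypothetical helper implementing ordinary least squares.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, noise-free data (all numbers are illustrative assumptions)
A_LIN, B_NL, N_NL = 3.3e-8, 3.8e-16, 1.0e7
atom_numbers = [1.5e5, 2.0e5, 2.5e5, 3.0e5, 3.5e5]
phi_l = [n_a * A_LIN / 2 for n_a in atom_numbers]          # linear rotation
phi_nl = [n_a * B_NL * N_NL / 2 for n_a in atom_numbers]   # nonlinear rotation

b, offset = fit_line(phi_l, phi_nl)
print(b, B_NL * N_NL / A_LIN)   # fitted slope versus the expected ratio
```

Because both angles are proportional to the atom number, the fitted slope recovers the ratio of the couplings without needing the atom number itself.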

The observed plot of b versus $N_{\mathrm{NL}}$, shown in Fig. 2a, is well fitted by a
simple model including saturation of the nonlinear response: 


déx, _ B(o)Nwi 1 
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magnetization, detected by a shot-noise-limited polarimeter (PM). The atom 
number is measured by quantitative absorption imaging (AI). b, Spectral 
positions of the pump, probe and imaging light on the $D_2$ transition.


Here $N_{\mathrm{NL}}^{(\mathrm{sat})} = 6.0(8) \times 10^{7}$ is a saturation parameter and the
nonlinear coupling strength is $B(\Delta_0) = 3.8(2) \times 10^{-16}$ rad per atom per
photon.

The noise in the nonlinear probe, again as a function of $N_{\mathrm{NL}}$, is
determined from the $\phi_{\mathrm{L}}$-vs-$\phi_{\mathrm{NL}}$ correlation plots. As illustrated in
Fig. 2b, c, the residual standard deviation of the fits indicates the
observed uncertainty, $\Delta\phi_{\mathrm{NL}}$, which includes the intrinsic uncertainty,
$\delta\phi_{\mathrm{NL}}$, and a small contribution from electronic noise. In Fig. 3, we plot
the fractional sensitivity, $\delta F_z/\langle\hat{F}_z\rangle$, versus $N_{\mathrm{NL}}$, calculated using
equation (2) and considering the whole polarized ensemble, with
$\langle\hat{F}_z\rangle = 7 \times 10^{5}$ spins. In agreement with equation (1), the log-log slope
indicates the scaling $\delta F_z \propto N_{\mathrm{NL}}^{-3/2}$ to within experimental
uncertainties in the range $N_{\mathrm{NL}} = 10^{6}$ to $10^{7}$, and super-Heisenberg
scaling, that is, steeper than $N_{\mathrm{NL}}^{-1}$, over two orders of magnitude
($N_{\mathrm{NL}} = 5 \times 10^{5}$ to $5 \times 10^{7}$).
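
Reading a scaling exponent off a log-log plot amounts to a straight-line fit in logarithmic coordinates. The sketch below shows one way to do this on synthetic data; it is not the authors' procedure. The sensitivities are generated from the ideal nonlinear term of equation (1) with assumed values for the coupling and the ensemble size, and `scaling_exponent` is a hypothetical helper.

```python
import math

def scaling_exponent(photon_numbers, sensitivities):
    """Least-squares slope of ln(sensitivity) versus ln(photon number)."""
    xs = [math.log(n) for n in photon_numbers]
    ys = [math.log(s) for s in sensitivities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

B_NL = 3.8e-16    # rad per atom per photon (assumed value)
SPINS = 7e5       # size of the polarized ensemble (assumed value)
photons = [5e5, 1e6, 5e6, 1e7, 5e7]
# Ideal, purely nonlinear sensitivity with no saturation or added noise
fractional = [1.0 / (B_NL * n**1.5) / SPINS for n in photons]

exponent = scaling_exponent(photons, fractional)
print(exponent)
```

An exponent below -1 is what the text calls super-Heisenberg scaling; real data would also need the saturation at large photon numbers to be taken into account.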

Results of numerical modelling using the Maxwell-Bloch equations 
to describe the nonlinear light-atom interaction are also shown in 
Fig. 3. Two curves are shown, for detunings $\Delta_0 \pm (2\pi \times 200\ \mathrm{kHz})$,
covering the combined uncertainty in $\Delta_0$ due to the probe laser
linewidth and inhomogeneous light shifts in the optical dipole trap. As
expected from equation (1), this alters the sensitivity only at low $N_{\mathrm{NL}}$
values. The model is described in detail in Supplementary Information. 
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Figure 2 | Calibration of nonlinear Faraday rotation. a, Ratio of the 
nonlinear rotation, $\phi_{\mathrm{NL}}$, to the linear rotation, $\phi_{\mathrm{L}}$, versus the nonlinear probe
photon number, $N_{\mathrm{NL}}$. The data points and error bars indicate best-fit and
standard errors from a linear regression, $\phi_{\mathrm{NL}} = b\phi_{\mathrm{L}} + \mathrm{const.}$, for given values of
$N_{\mathrm{NL}}$. The red curve is a fit using equation (2), showing the expected nonlinear
behaviour, $\phi_{\mathrm{NL}} \propto N_{\mathrm{NL}}$, with some saturation for large values of $N_{\mathrm{NL}}$. b, $\phi_{\mathrm{L}}$-vs-$\phi_{\mathrm{NL}}$
correlation plots for two values of $N_{\mathrm{NL}}$. The atom number, $N_A$, is varied to
produce a range of $\phi_{\mathrm{L}}$ and $\phi_{\mathrm{NL}}$ values. Green, no atoms ($N_A = 0$); red,
$1.5 \times 10^{5} < N_A < 3.5 \times 10^{5}$; blue, $N_A \approx 7 \times 10^{5}$. The blue circles are shown as a
check on detector saturation, and are not included in the analysis.
Figure 3 | Super-Heisenberg scaling. Fractional sensitivity, $\delta F_z/\langle\hat{F}_z\rangle$, of
the nonlinear probe plotted versus the number of interacting photons, $N_{\mathrm{NL}}$.
Blue circles indicate the measured sensitivity, orange curves show results of
numerical modelling, and the black lines indicate SQL, Heisenberg-limit and
super-Heisenberg (SH) scaling for reference. Scaling surpassing the Heisenberg
limit, $\propto N_{\mathrm{NL}}^{-1}$, is observed over two orders of magnitude. The measured damage
to the magnetization, $\eta$, shown as green circles, confirms the non-destructive
nature of the measurement. Error bars for standard errors would be smaller
than the symbols and are not shown.


For photon numbers above $N_{\mathrm{NL}} \approx 2 \times 10^{7}$, the saturation of the
nonlinear rotation alters the slope. This can be understood as optical
pumping of atoms into states other than $|F = 1, m_F = 1\rangle$ by the
nonlinear probe. The damage to the atomic magnetization, $\eta \equiv 1 - \phi_{\mathrm{L}}'/\phi_{\mathrm{L}}$,
also shown in Fig. 3, remains small, confirming the non-destructive
nature of the measurement. The finite damage even for small $N_{\mathrm{NL}}$
values is possibly due to stray light and/or magnetic fields disturbing
the atoms during the 20-ms period between the two linear
measurements. For large N, higher-order nonlinear effects including optical
pumping limit the range of super-Heisenberg scaling.

The experimental results illustrate the subtle relationship between 
scaling and sensitivity in a nonlinear system. For an ideal nonlinear 
measurement, the improved scaling would guarantee better absolute 
sensitivity for sufficiently large N values. Indeed, when the measure- 
ment bandwidth is taken into account, the nonlinear probe overtakes 
the linear one at $N = 3.2 \times 10^{6}$, where both achieve a sensitivity of
$1.1 \times 10^{2}$ spins Hz$^{-1/2}$. As a consequence, the nonlinear technique
performs better in fast measurements. In contrast, when measurement 
time is not a limited resource, the comparison can be made on a 
‘sensitivity-per-measurement’ basis and the ideal crossover point, of 
$3.2 \times 10^{3}$ spins at $N = 8.7 \times 10^{7}$, is never actually reached, owing to the
higher-order nonlinearities. Evidently, super-Heisenberg scaling allows 
but does not guarantee enhanced sensitivity: for the nonlinear tech- 
nique to overtake the linear, it is also necessary that the scaling extend to 
large enough values of N. This example shows also that resource con- 
straints dramatically influence the comparison between the linear and 
nonlinear techniques. See also Supplementary Information. 
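
The two crossover points can be estimated from the idealized form of equation (1). The sketch below rests on several assumptions that are not spelled out in the text: the couplings are taken as $A = 3.3 \times 10^{-8}$ and $B = 3.8 \times 10^{-16}$ (reconstructed exponents), the linear probe is treated as purely linear and the nonlinear probe as purely nonlinear, and "taking bandwidth into account" is modelled by multiplying each sensitivity by the square root of its illumination time (40 μs and 54 ns). The function names are hypothetical.

```python
import math

A_LIN = 3.3e-8      # rad per atom, linear coupling (assumed value)
B_NL = 3.8e-16      # rad per atom per photon (assumed value)
TAU_L = 40e-6       # s, total linear illumination time
TAU_NL = 54e-9      # s, nonlinear pulse duration

def linear_sens(n):
    """Ideal linear-probe uncertainty in spins for n photons."""
    return 1.0 / (A_LIN * math.sqrt(n))

def nonlinear_sens(n):
    """Ideal nonlinear-probe uncertainty in spins for n photons."""
    return 1.0 / (B_NL * n**1.5)

# Per-measurement comparison: the two uncertainties are equal when N = A / B
n_per_shot = A_LIN / B_NL
# Bandwidth-normalized comparison (assumed model: uncertainty * sqrt(duration))
n_bandwidth = (A_LIN / B_NL) * math.sqrt(TAU_NL / TAU_L)

print(f"per-measurement crossover: N = {n_per_shot:.2e}, "
      f"{linear_sens(n_per_shot):.2e} spins")
print(f"bandwidth crossover:       N = {n_bandwidth:.2e}, "
      f"{linear_sens(n_bandwidth) * math.sqrt(TAU_L):.2e} spins per root Hz")
```

Under these assumptions the leading digits come out close to the 8.7, 3.2 and 1.1 quoted in the paragraph, which suggests the model captures the intended comparison. The saturation at large photon numbers is deliberately left out, and that saturation is what prevents the per-measurement crossover from being reached in practice.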

We have experimentally realized a system designed to achieve
metrological sensitivity beyond the Heisenberg limit, $\delta\mathcal{X} \propto N^{-1}$, using
metrologically relevant interactions among particles. To generate
pairwise photon-photon interactions, we use fast, nonlinear optical effects
in a cold atomic ensemble and measure the ensemble magnetization,
$\langle\hat{F}_z\rangle$, with super-Heisenberg sensitivity $\delta F_z \propto N^{-3/2}$. To quantify the
photon-photon interaction and the sensitivity rigorously, we calibrate
against a precise, non-destructive, linear measurement of the same
atomic quantity⁸,⁹, demonstrate quantum-noise-limited performance
of the optical instrumentation and place an upper limit on systematic, 
that is, non-atomic, nonlinearities at the level of a few per cent. The 
experiment demonstrates the use of interparticle interactions as a new 
resource for quantum metrology. Although possible applications to 
precision measurement will require detailed study, our experiment 
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shows that interactions can produce super-Heisenberg scaling and 
improved precision in a quantum-limited measurement. 


METHODS SUMMARY 


Linear and nonlinear probe light. The probe beam is aligned on the axis of the
trap with a waist of 20 μm, chosen to match the radial dimension of the cloud. In
the linear probing regime, we use a train of 40 1-μs pulses, with a repetition rate of
100 kHz, each containing $3 \times 10^{6}$ photons detuned by +1.5 GHz from the
$F = 1 \rightarrow F' = 0$ transition. The maximum intensity is 0.1 W cm$^{-2}$. The signals
are summed and can be considered a single, modulated pulse.

The nonlinear probe consists of a single, Gaussian-shaped pulse with a
full-duration at half-maximum of 54 ns. The maximum intensity of the nonlinear
probe is 7 W cm$^{-2}$ for a pulse with $10^{7}$ photons. Theory predicts that $\alpha^{(1)} = 0$ at
a detuning of $\Delta = 2\pi \times 462$ MHz in free space. This is modified by trap-induced
light shifts, and we use the empirical value $\Delta_0 = 2\pi \times 468.5$ MHz, which gives zero
rotation at low probe intensity.

Instrumental noise. The instrumental noise is quantified by measuring $\mathrm{var}(\hat{S}_y)$
versus input photon number N (that is, $N_{\mathrm{L}}$ or $N_{\mathrm{NL}}$), in the absence of atoms, to find
contributions from electronic noise ($V_{\mathrm{el}} \propto N^{0}$), shot noise ($\propto N^{1}$) and technical
noise ($\propto N^{2}$), as described in Supplementary Information. We find that the
contributions from electronic noise to the linear ($V_{\mathrm{el}}^{(\mathrm{L})}$) and nonlinear ($V_{\mathrm{el}}^{(\mathrm{NL})}$) probes are
3 X 10° and 4 X 10° per pulse, respectively, and that the technical noise is negligible. 
The instrumentation is thus shot-noise-limited over the full range of N used in the 
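experiment. The intrinsic rotation uncertainty of the nonlinear probe, $\delta\phi_{\mathrm{NL}}$, is
calculated from the measured $\Delta\phi_{\mathrm{NL}}$ as $(\delta\phi_{\mathrm{NL}})^{2} = (\Delta\phi_{\mathrm{NL}})^{2} - V_{\mathrm{el}}^{(\mathrm{NL})}$. The correction
is at most 5%.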

Instrumental linearity. The linearity of the experimental system and analysis is 
verified by using a wave plate in place of the atoms to produce a linear rotation 
equal to the largest observed nonlinear rotation. Over the full range of photon 
numbers used in the experiment, the detected rotation angle is constant to within 
5%, and SQL scaling is observed. 


Received 31 July; accepted 15 December 2010. 


1. Giovannetti, V., Lloyd, S. & Maccone, L. Quantum metrology. Phys. Rev. Lett. 96,
010401 (2006).

2. Mitchell, M. W., Lundeen, J. S. & Steinberg, A. M. Super-resolving phase
measurements with a multiphoton entangled state. Nature 429, 161-164 (2004).

3. Lee, H., Kok, P. & Dowling, J. P. A quantum Rosetta stone for interferometry. J. Mod.
Opt. 49, 2325-2338 (2002).

4. Boixo, S., Flammia, S. T., Caves, C. M. & Geremia, J. Generalized limits for
single-parameter quantum estimation. Phys. Rev. Lett. 98, 090401 (2007).

5. Choi, S. & Sundaram, B. Bose-Einstein condensate as a nonlinear Ramsey
interferometer operating beyond the Heisenberg limit. Phys. Rev. A 77, 053613
(2008).

6. Roy, S. M. & Braunstein, S. L. Exponentially enhanced quantum metrology. Phys.
Rev. Lett. 100, 220501 (2008).

7. Boixo, S. et al. Quantum metrology: dynamics versus entanglement. Phys. Rev. Lett.
101, 040403 (2008).

8. Koschorreck, M., Napolitano, M., Dubost, B. & Mitchell, M. W. Sub-projection-noise
sensitivity in broadband atomic magnetometry. Phys. Rev. Lett. 104, 093602
(2010).

9. Koschorreck, M., Napolitano, M., Dubost, B. & Mitchell, M. W. Quantum
nondemolition measurement of large-spin ensembles by dynamical decoupling.
Phys. Rev. Lett. 105, 093602 (2010).

10. Kominis, I., Kornack, T., Allred, J. & Romalis, M. A subfemtotesla multichannel
atomic magnetometer. Nature 422, 596-599 (2003).

11. Budker, D. & Romalis, M. Optical magnetometry. Nature Phys. 3, 227-234 (2007).

12. Hammerer, K., Sørensen, A. S. & Polzik, E. S. Quantum interface between light and
atomic ensembles. Rev. Mod. Phys. 82, 1041-1093 (2010).

13. Napolitano, M. & Mitchell, M. W. Nonlinear metrology with a quantum interface. N.
J. Phys. 12, 093016 (2010).

14. Fleischhauer, M., Matsko, A. B. & Scully, M. O. Quantum limit of optical
magnetometry in the presence of ac Stark shifts. Phys. Rev. A 62, 013808 (2000).

15. Scully, M. O., Englert, B. G. & Walther, H. Quantum optical tests of complementarity.
Nature 351, 111-116 (1991).

16. Wasilewski, W. et al. Quantum noise limited and entanglement-assisted
magnetometry. Phys. Rev. Lett. 104, 133601 (2010).

17. Wolfgramm, F. et al. Squeezed-light optical magnetometry. Phys. Rev. Lett. 105,
053601 (2010).

18. Beltran, J. & Luis, A. Breaking the Heisenberg limit with inefficient detectors. Phys.
Rev. A 72, 045801 (2005).

19. Woolley, M. J., Milburn, G. J. & Caves, C. M. Nonlinear quantum metrology using
coupled nanomechanical resonators. N. J. Phys. 10, 125018 (2008).

20. Chase, B. A., Baragiola, B. Q., Partner, H. L., Black, B. D. & Geremia, J. M.
Magnetometry via a double-pass continuous quantum measurement of atomic
spin. Phys. Rev. A 79, 062107 (2009).

21. Negretti, A., Henkel, C. & Mølmer, K. Quantum-limited position measurements of a
dark matter-wave soliton. Phys. Rev. A 77, 043606 (2008).


©2011 Macmillan Publishers Limited. All rights reserved 


22. Boixo, S. et al. Quantum-limited metrology and Bose-Einstein condensates. Phys.
Rev. A 80, 032103 (2009).

23. Kubasik, M. et al. Polarization-based light-atom quantum interface with an
all-optical trap. Phys. Rev. A 79, 043815 (2009).

24. Braginskii, V. B. & Vorontsov, Y. I. Quantum-mechanical limitations in macroscopic
experiments and modern experimental technique. Sov. Phys. Usp. 17, 644 (1975).


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank I. H. Deutsch and F. Illuminati for comments. We thank
C. M. Caves and A. D. Codorniu for inspiration. This work was supported by the Spanish
Ministry of Science and Innovation through the Consolider-Ingenio 2010 project QOIT,
the Ingenio-Explora project OCHO (ref. FIS2009-07676-E/FIS) and project ILUMA (ref.
FIS2008-01051), by the Marie-Curie RTN EMALI, and by Fundació CELLEX Barcelona.


Author Contributions All authors contributed equally to the work presented in this 
paper. 

Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to M.N. (mario.napolitano@icfo.es). 


24 MARCH 2011 | VOL 471 | NATURE | 489 






doi:10.1038/nature09819 


A long noncoding RNA maintains active chromatin to 
coordinate homeotic gene expression 
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The genome is extensively transcribed into long intergenic non- 
coding RNAs (lincRNAs), many of which are implicated in gene 
silencing’*. Potential roles of lincRNAs in gene activation are 
much less understood**. Development and homeostasis require 
coordinate regulation of neighbouring genes through a process 
termed locus control®. Some locus control elements and enhancers 
transcribe lincRNAs”"’, hinting at possible roles in long-range 
control. In vertebrates, 39 Hox genes, encoding homeodomain 
transcription factors critical for positional identity, are clustered 
in four chromosomal loci; the Hox genes are expressed in nested 
anterior-posterior and proximal-distal patterns colinear with their 
genomic position from 3’ to 5’ of the cluster¹¹. Here we identify
HOTTIP, a lincRNA transcribed from the 5’ tip of the HOXA locus
that coordinates the activation of several 5’ HOXA genes in vivo.
Chromosomal looping brings HOTTIP into close proximity to its 
target genes. HOTTIP RNA binds the adaptor protein WDR5 
directly and targets WDR5/MLL complexes across HOXA, driving 
histone H3 lysine 4 trimethylation and gene transcription. Induced 
proximity is necessary and sufficient for HOTTIP RNA activation 
of its target genes. Thus, by serving as key intermediates that trans- 
mit information from higher order chromosomal looping into 
chromatin modifications, lincRNAs may organize chromatin 
domains to coordinate long-range gene activation. 

We examined chromosome structure and histone modifications in 
human primary fibroblasts derived from several anatomic sites¹², and
found distinctive differences in the HOXA locus. High throughput
chromosome conformation capture (5C)¹³ across HOXA revealed that
its higher order structure is dependent on positional identity. In ana- 
tomically distal cells (for example, foreskin and foot fibroblasts), we 
detected abundant chromatin interactions within the transcriptionally 
active 5’ HOXA locus (with reference to the directions of transcription 
of constituent Hox genes), pointing to a compact and looped con- 
formation. In contrast, no long-range chromatin interactions are 
detected within the transcriptionally silent 3’ HOXA which seems 
largely linear (Fig. 1a). Strikingly, anatomically proximal cells (for 
example, lung fibroblasts) have the diametrically opposite pattern. 
The ON and OFF states of Hox and other key developmental genes 
are maintained by the MLL/Trithorax (Trx) and polycomb group 
(PcG) proteins, which mediate trimethylation of histone H3 lysine 4 
(H3K4me3) to activate genes or lysine 27 (H3K27me3) to repress 
genes¹⁴. The portions of HOXA in tight physical interaction are
marked by broad domains of H3K4me3, whereas H3K27me3 marks
the physically extended and transcriptionally silent regions (Fig. 1a).

On the very 5’ and 3’ edges of the two respective interaction clusters 
are two lincRNA loci that exhibit distinct chromatin modifications. 
The 3’ element has been previously identified as the
myelopoiesis-associated lincRNA HOTAIRM1 (ref. 15). The 5’ element, for which


we suggest the name HOTTIP for ‘HOXA transcript at the distal tip’,
exhibits bivalent H3K4me3 and H3K27me3, a histone modification
pattern associated with poised regulatory sequences¹⁶. Comparison
with RNA polymerase II occupancy and RNA expression showed that 
the bivalent H3K4me3 and H3K27me3 modifications on HOTTIP gene 
do not require HOTTIP transcription, but transcription of HOTTIP is 
associated with increased H3K4me3 and decreased H3K27me3 (Fig. 1a, 
left). Chromatin immunoprecipitation (ChIP) analysis confirmed that 
the HOTTIP gene is occupied by both polycomb repressive complex 2 
(PRC2) and MLL complex, consistent with the bivalent histone marks 
(Supplementary Fig. 1a). 

HOTTIP transcription yields a 3,764-nucleotide, spliced and poly- 
adenylated lincRNA that initiates ~330 bases upstream of HOXA13. 
Only the strand antisense to HOXA genes is transcribed (Supplemen- 
tary Fig. 1b). Genes near the 5’ end of each HOX cluster tend to be 
expressed in more posterior and/or distal anatomical locations. 
Consistent with its genomic location 5’ to HOXA13, HOTTIP is 
expressed in distal and/or posterior anatomic sites (Fig. 1b). In situ 
hybridization of developing mouse and chick embryos confirmed that 
HOTTIP is expressed in posterior and distal sites in vivo, indicating a 
conserved expression pattern from development to adulthood (Fig. 1c 
and Supplementary Fig. 1c). Even in distal cells where HOTTIP is 
expressed, its RNA level is very low and estimated to be ~0.3 copies 
per cell (Supplementary Fig. 2). 

We employed small interfering RNAs (siRNAs) to knock down 
HOTTIP RNA in fibroblasts from a distal anatomic site (foreskin), 
and examined expression of 5’ HOXA genes by quantitative reverse 
transcription PCR. Notably, HOTTIP RNA knockdown abrogated 
expression of distal HOXA genes across 40 kilobases with a trend 
dependent on the distance to HOTTIP. The strongest blockade was 
observed for HOXA13 and HOXA11, with progressively less severe 
effects on HOXA10, HOXA9 and HOXA7 (Fig. 2a). The effect on gene 
transcription appeared to be unidirectional, as there were no appre- 
ciable changes in the levels of EVX1, located ~40 kilobases 5’ of the 
HOXA cluster (data not shown). HOTTIP knockdown did not affect 
expression of the highly homologous HOXD genes or other control genes, 
nor did it induce antisense transcription at its own locus (Fig. 2b, 
Supplementary Fig. 3a). Several independent siRNAs targeting HOTTIP 
yielded similar results (Supplementary Fig. 3b). These results indicate 
that HOTTIP RNA is necessary to coordinate activation of 5’ HOXA 
genes. 

We next addressed the function of HOTTIP RNA in vivo in the 
developing chick limb bud (Fig. 2c). Whereas prior genetic studies 
of noncoding RNAs (ncRNAs) involved deletion or insertion into 
the gene locus’’, we wished to distinguish the functions of HOTTIP 
RNA from its corresponding DNA element. HOTTIP gene can nucleate 
H3K4 and H3K27 methylation independent of transcription (Fig. 1a), 
and the precise genomic distance between upstream enhancer elements 
and Hox genes is critical for their proper colinear activation”. 
Therefore, we used RNA interference (RNAi) in chick embryos, where 
replication-competent retroviruses can deliver short-hairpin RNAs 
(shRNAs) with high penetrance and precise spatiotemporal control’* 
(Supplementary Fig. 4). In the limb bud, 5’ HoxA genes are transcribed 
in a nested pattern along the proximal-distal axis”’. In this tissue, HoxA 
function is highly redundant with that of the HoxD locus, which 
allowed us to assess altered HoxA expression patterns without major 
changes in anatomic landmarks”’. We injected retroviruses carrying 
shRNAs against chick HOTTIP into upper limb buds of stage 13 chicks; 
RT-PCR and in situ hybridization were performed on both control and 
knockdown samples after 2-4 days. Knockdown of HOTTIP RNA by 
two independent shRNAs in limb buds decreased expression of 
HoxA13, HoxA11 and HoxA10, again with a graded impact depending 
on genomic proximity to the HOTTIP gene. Vector control or an 
shRNA that fails to deplete HOTTIP RNA had little effect on Hox gene 
expression (Fig. 2d). In situ hybridization on whole embryos (Fig. 2e) 
and sections (Supplementary Fig. 5) revealed that HOTTIP RNA most 
strongly affects HoxA gene expression at the distal edge of the developing 
limb bud, where the 5’ HoxA genes are most strongly expressed. By 
stage 36, limbs depleted of HOTTIP RNA showed notable shortening 
and bending of distal bony elements, including the radius, ulna and 
third digit (~20% length reduction for each compared to contralateral 
and stage-matched limbs treated with control virus, P < 0.05, Student’s 
t-test, Fig. 2f). This phenotype resembled some of the defects in mice 
lacking HoxA11 and HoxA13 (refs 21-23). Together, these data indicate 
that HOTTIP RNA controls activation of distal Hox genes in vivo. 

Figure 1 | HOTTIP is a lincRNA transcribed in distal anatomic sites. 
a, Chromatin state map of distal versus proximal cells. Top panels, 
chromosome conformation capture-carbon copy (5C) analysis of distal 
(foreskin) and proximal (lung) human fibroblasts. Heat map representations 
(generated by my5C, ref. 30) of 5C data (bin size 30 kb, step size 3 kb) for HOXA 
in foreskin and lung fibroblasts. Red intensity of each pixel indicates relative 
interaction between the two points on the genomic coordinates. The diagonal 
represents frequent cis interactions between regions located in close proximity 
along the linear genome. 5C signals that are away from the diagonal represent 
long-range looping interactions. Bottom panels, chromatin occupancy across 
HOXA. x-axis is genomic coordinate; y-axis depicts occupancy of the indicated 
histone marks or protein (ChIP/input). Box and arrows highlight chromatin 
states of HOTTIP gene. b, HOTTIP RNA expression in primary human 
fibroblasts from 11 anatomic sites. Means ± s.d. are shown (n = 2). c, In situ 
hybridization of HOTTIP RNA in E13.5 mouse embryo. 

The broad impact of HOTTIP RNA on gene activation across the 
HOXA locus is reminiscent of the broad domains of chromatin modi- 
fications demarcating active and silent chromosomal domains’”. 5C 
analysis of control and HOTTIP-depleted cells showed little change in 
higher order chromosomal structure, indicating that the chromo- 
somal looping is pre-configured and upstream of gene expression 
(Supplementary Fig. 6a). In contrast, HOTTIP RNA knockdown led to 
broad loss of H3K4me3 and H3K4me2 across the HOXA locus, most 
prominently over 5’ HOXA and HOTTIP gene itself (Fig. 3a, 
Supplementary Figs 6b and 7). HOTTIP RNA knockdown also increased 
H3K27me3 focally over HOTTIP gene, but had little impact on 
H3K27me3 across HOXA. These results indicate that HOTTIP RNA 
is required for maintenance of H3K4me3 across the HOXA locus. These 
findings also imply that loss of 5’ HOXA gene transcription upon 
HOTTIP RNA knockdown is likely to be due to loss of H3K4me3 
(or other changes) rather than ectopic spread of H3K27me3. 

H3K4 methylation of the HOX loci is carried out by the MLL family 
of complexes”. In mammals, at least six MLL family members of 
SET-domain-containing lysine methyltransferases interact with a core 
complex of WDR5, ASH2L, RBBP5, as well as with other proteins, for 
substrate recognition and genomic targeting™*. Genetic analyses 
indicate that MLL1 and 2 are most essential for HOX gene expression in 
fibroblasts”, and MLL1 in particular is recruited to promoters of HOX 
genes to maintain their activation states*®. In distally-derived human 
fibroblasts, MLL1 and WDR5 densely occupied an extended region of the 
5’ HOXA cluster, coincident with the H3K4me3 domain, with specific 
‘peaks’ of occupancy near the transcriptional start sites (TSS) of 
multiple 5’ HOXA genes (Fig. 3b). Strikingly, HOTTIP RNA knockdown 
abrogated the peaks of MLL1 and WDR5 occupancy near TSS, resulting 
in diffuse and less intense binding of MLL1 and WDR5 across the HOXA 
cluster, most prominently over the 5’ HOXA domain. HOTTIP RNA 
knockdown also led to increased accumulation of MLL1 and WDR5 on 
HOTTIP gene itself (Supplementary Fig. 8). Thus, HOTTIP RNA seems 
critical for maintaining a specific pattern of MLL complex occupancy 
across the HOXA locus to facilitate H3K4me3 and active transcription. 

To define the molecular link between HOTTIP RNA and MLL complex, 
we reasoned that HOTTIP RNA may physically interact with one 
or more subunits of the MLL complex. Purified, in-vitro-transcribed, 
full-length HOTTIP RNA bound specifically to recombinant 
glutathione-S-transferase-conjugated WDR5 (GST-WDR5), but not 
to GST, RBBP5, ASH2L, or the telomeric protein TRF1 (also known as 
TERF1; Fig. 4a, b). The C terminus of MLL1, containing the SET 
domain, bound non-specifically to all RNAs, consistent with previous 
studies””. Immunoprecipitation of endogenous WDR5 from two 
different cell lines each specifically retrieved endogenous HOTTIP RNA 
(Fig. 4c), indicating that WDR5 and HOTTIP RNA interact in living 
cells. Immunoprecipitation of an epitope-tagged WDR5 from a stable 
cell line that previously enabled stoichiometric purification of WDR5- 
interacting proteins* also specifically retrieved HOTTIP RNA 
(Supplementary Fig. 9). Knockdown of WDR5 broadly inhibited 
expression of 5’ HOXA genes, and also abrogated HOTTIP transcription, 
demonstrating mutual interdependence between HOTTIP RNA 
and WDR5 (Fig. 4d). 

Figure 2 | HOTTIP is required for coordinate activation of 5’ HOXA genes. 
a, b, Knockdown of HOTTIP RNA abrogates expression of 5’ HOXA genes in 
foreskin fibroblasts (a), but not HOXD or BID genes (b). Means ± s.d. are 
shown (n = 3). GFP, green fluorescent protein. c, Schematic of chick RNAi 
experiment. d, HOTTIP RNA is required for 5’ HoxA gene expression in vivo. 
RT-PCR of the indicated genes from control or HOTTIP-depleted distal limb 
bud is shown; quantification and normalization by Actin signal is shown below 
each band. GAG signal confirms successful retroviral transduction in all cases. 
e, In situ hybridization of 5’ HoxA genes in chick limb buds. Arrowheads 
highlight distal domains of high HoxA gene expression that are affected by 
HOTTIP knockdown. f, Shortening of distal bony elements in HOTTIP- 
depleted forelimbs. Alcian blue staining highlights the skeletal elements. Red 
and purple lines highlight radius and 3rd digit lengths, respectively. 

Figure 3 | HOTTIP RNA is required for the active chromatin state of 5’ 
HOXA cluster. a, Knockdown of HOTTIP RNA broadly decreases H3K4me3 
across 5’ HOXA locus but focally affects H3K27me3 at HOTTIP gene. Display 
is as in Fig. 1a. b, Knockdown of HOTTIP RNA abrogates peaks of MLL1 and 
WDR5 occupancy near TSSs of 5’ HOXA genes and leads to accumulation of 
these proteins at HOTTIP gene itself. Arrows highlight peaks of MLL1 and 
WDR5 occupancy; open arrowheads highlight chromatin state of HOTTIP 
gene upon HOTTIP RNA knockdown. 

HOTTIP RNA seems to regulate genes in cis, due to its low copy 
number, distance dependence of HOXA target gene activation on 
endogenous HOTTIP, and the physical proximity of HOTTIP and 
its target genes as seen in 5C. Indeed, ectopic expression of HOTTIP 
RNA by retroviral transduction of lung fibroblasts, which do not 
express HOTTIP, failed to activate expression of distal HOXA genes, 
and did not change H3K4me3 and H3K27me3 patterns across HOXA 
(Supplementary Fig. 10). Moreover, in foreskin fibroblasts that express 
endogenous HOTTIP, ectopic HOTTIP expression did not induce 5’ 
HOXA genes, nor rescue the effects of depleting endogenous nascent 
HOTTIP RNA (Supplementary Fig. 11). The lack of response in fore- 
skin fibroblasts is notable because endogenous HOTTIP RNA is active 
in these cells, indicating that the protein partners of HOTTIP are all 
present and target genes are receptive. Ectopically expressed HOTTIP 
RNA, being transcribed from retroviral insertion sites scattered 
randomly in the genome, may not be able to find 5’ HOXA genes. In 
contrast, endogenous HOTTIP RNA is directly positioned near the 
5' HOXA genes by chromosomal looping, allowing interaction and 
control. 
Figure 4 | HOTTIP RNA programs active chromatin via WDR5. 
a, Summary of RNA-protein interaction studies. Each of the indicated 
recombinant proteins was purified and used to retrieve purified HOTTIP RNA 
or control histone RNA in vitro. Only GST-WDR5 specifically retrieved 
HOTTIP. b, HOTTIP RNA binds directly and specifically to WDR5. Left, 
purified GST and GST-WDR5 are visualized by SDS-PAGE and Coomassie 
Blue staining. Right, retrieved RNAs are quantified by qRT-PCR. c, HOTTIP 
RNA binds specifically to WDR5 in cells. Immunoprecipitation (IP) of 
endogenous WDR5 protein from PC3 (prostate) and T24 (bladder) carcinoma 
cells specifically retrieved HOTTIP, whereas control IPs with IgG or the 
chromatin binder SIRT6 did not. U1 spliceosomal RNA served as negative 
control. d, WDR5 is required for 5’ HOXA gene expression, including HOTTIP 
RNA. e, HOTTIP RNA recruitment potentiates transcription. Left, the BoxB 
tethering system. BoxB-RNA specifically binds λN fused to GAL4 DNA binding 
domain, recruiting the complex to a UAS-luciferase reporter gene. After transient 
transfection, IP of GAL4-λN specifically retrieves BoxB-HOTTIP RNA. Right, 
luciferase activity after co-transfection of the indicated constructs. *P < 0.05 
(Student’s t-test comparing BoxB-LacZ versus BoxB-HOTTIP). Sloped triangle 
indicates increasing input of plasmids encoding ncRNAs. Means ± s.d. (n = 3) 
are shown for all panels. 


To test the requirement of an exogenous targeting mechanism, we 
engineered an allele of HOTTIP RNA that can be artificially recruited 
to a reporter gene. Addition of five copies of the BoxB RNA element” 
to HOTTIP RNA allows the fusion transcript to be recruited to the λN 
RNA-binding domain fused to a GAL4 DNA-binding domain (Fig. 4e). 
Recruitment of HOTTIP RNA to a silent GAL4 promoter is not sufficient 
to initiate transcription, but can significantly boost transcription 
if the promoter is also bound by WDR5 and transcriptionally 
active (Fig. 4e). By uncoupling the sites of HOTTIP transcription 
versus HOTTIP RNA function, this experiment indicates that the 
proximity of HOTTIP RNA, rather than the act of transcription, 
maintains target gene expression. To further support the functionality 
of HOTTIP RNA, deletion analysis identified a ~1 kb domain in the 5’ 
region of HOTTIP RNA (HOTTIP^(exons 1-2)) that retains WDR5 binding 
activity (Supplementary Fig. 12a). Enforced overexpression of 
HOTTIP^(exons 1-2) in foreskin fibroblasts inhibited 5’ HOXA gene 
expression in an apparently dominant negative manner (Supplementary Fig. 12b). 

In summary, HOTTIP RNA is a key locus control element of HOXA 
genes and distal identity. Chromosomal looping brings HOTTIP RNA 
in close proximity to the 5’ HOXA genes. HOTTIP transcription acts as 
a switch to produce HOTTIP lincRNA, which binds to and targets 
WDR5-MLL complexes to the 5’ HOXA locus, yielding a broad 
domain of H3K4me3 and transcription activation (Supplementary 
Fig. 13). The mutual interdependence between HOTTIP RNA and 
WDR5 creates a positive feedback loop that maintains the ON state 
of the locus. These findings provide an integrated view linking 
three-dimensional genome organization to dynamic programming of 
chromatin states, and ultimately to developmental pattern formation. 

H3K4 methylation is a feature of almost all transcribed genes, and 
MLL family proteins are involved in many cell fate decisions in 
development and disease. Our findings suggest that additional 
lincRNAs, especially those associated with enhancers or enhancer-like 
activities* ‘°, may also be involved in gene activation by programming 
active chromatin states, and highlight WDR5 and other WD40 repeat 
proteins as candidate adaptors that link chromatin remodelling com- 
plexes to lincRNAs. Cis-restricted lincRNAs may be ideally suited to 
link chromosome structure and gene expression. Because such 
lincRNA can only act on its neighbours in space, information in higher 
order chromosomal looping can be faithfully transmitted to chromatin 
modification via RNA recruitment of enzymatic activities, and thus 
into gene expression. 


METHODS SUMMARY 


High throughput chromosome conformation capture (5C) was performed on 
foreskin and lung fibroblasts, as well as foreskin fibroblasts treated with control 
or siRNA against HOTTIP RNA, as described’*. siRNA knockdown experiments 
on cultured human fibroblasts and qPCR were performed as described previ- 
ously’. ChIP-chip was performed as described’’ using ultra-high-density HOX 
tiling arrays. Full-length HOTTIP RNA was cloned by 5’ and 3’ rapid amplifica- 
tion of cloned/cDNA ends (RACE). Single-molecule RNA-fluorescent in situ 
hybridization (FISH) was performed using a pool of fluorescently-labelled oligo- 
nucleotides specific to HOTTIP RNA. In vivo HOTTIP RNA knockdown in chick 
was accomplished by microinjecting retroviruses carrying shRNA into prospective 
wing and leg buds and the animals were harvested at 2, 4 and 9 days post injection 
for RNA in situ hybridization, immunohistochemistry and whole-mount limb 
analysis, respectively. RNA-immunoprecipitation with WDR5 was performed as 
described’? Tethering experiments were done in 293T cells with co-transfections 
of various constructs containing an upstream activating sequence (UAS)-luciferase 
reporter, GAL4-WDR5, BoxB alone, and BoxB fused to full-length HOTTIP or 
full-length LacZ; cells were lysed 48 h after transfection and luciferase activity was 
determined. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Cells. Primary human fibroblasts derived from different anatomic sites were as 
described'**'~**, Primary human fibroblasts in culture retain their positional identity 
and have been used to examine chromatin states associated with positional memory, 
which have been confirmed in vivo***””. 
Chromatin immunoprecipitation followed by microarray analysis. ChIP-chip 
was performed using anti-H3K27me3 (Abcam), anti-H3K4me3 (Abcam), anti- 
H3K4me2 (Abcam), anti-histone H3 (Abcam), anti-PolII (Abcam), anti-MLL1 
(gift of R. Roeder), and anti-WDR5”* antibodies as previously described’. 
Chromatin from each indicated cell type or RNAi treatment was split into multiple 
tubes and subjected to ChIP with different antibodies in parallel. Retrieved DNA and 
input chromatin were competitively hybridized to custom tiling arrays interrog- 
ating human HOX loci at 5-bp resolution as previously described’. 
5C analysis of the ENm010 HoxA1 region. 5C primers were designed at HindIII 
restriction sites using 5C primer design tools previously developed’* and made 
available online at http://my5C.umassmed.edu (ref. 30). Reverse primers were 
designed for fragments overlapping a known transcription start site from 
GENCODE transcripts”, or overlapping a start site as experimentally determined 
by CAGE Tag data of the ENCODE pilot project*’. Forward primers were designed 
for all other HindIII restriction fragments. Primers were excluded if highly 
repetitive sequences prevented the design of a sufficiently unique 5C primer. 
Primer settings were: U-BLAST: 3; S-BLAST: 130; 15-MER: 1320; MIN_FSIZE: 40; 
MAX_FSIZE: 50000; OPT_TM: 65; OPT_PSIZE: 40. DNA sequence of the uni- 
versal tails of forward primers was CCTCTCTATGGGCAGTCGGTGAT; DNA 
sequence for the universal tails of reverse primers was AGAGAATGAGGAACC 
CGGGGCAG. A 6-base barcode was included between the specific part of the 
primers and the universal tail. In total 17 reverse primers and 90 forward primers 
were designed in the 500 kb HoxA1 locus (ENm010) and hence a total of 1,530 cis 
interactions were interrogated in this region. Primer sequences are available 
separately (Supplementary Table 1). 
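
The interaction count quoted above follows from pairing every forward primer with every reverse primer. A minimal sketch of that arithmetic is given here; the primer names are placeholders invented for illustration and are not the sequences of Supplementary Table 1.

```python
from itertools import product

# Primer counts are taken from the text; the names are hypothetical placeholders.
reverse_primers = [f"REV_{i}" for i in range(1, 18)]   # 17 reverse primers
forward_primers = [f"FOR_{i}" for i in range(1, 91)]   # 90 forward primers

# Each forward/reverse combination can report one cis interaction in 5C.
pairs = list(product(forward_primers, reverse_primers))

total_primers = len(forward_primers) + len(reverse_primers)
print(f"{total_primers} primers interrogate {len(pairs)} forward-reverse pairs")
```

The same total of 107 primers reappears in the description of the 5C reaction.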

3C was performed with HindIII as previously described” separately for fetal 
lung and foreskin fibroblasts (FB) and also for the control and HOTTIP knock- 
down foreskin FBs. For the 5C reaction, a total of 107 forward and reverse primers 
of HoxA1 region were mixed with either the ENCODE random region (ENr) 
primer pool comprising 2,673 forward and 523 reverse primers (covering 30 
additional ENCODE regions) or the ENr313 primer pool comprising 57 forward 
and 58 reverse primers (covering 1 additional ENCODE region). 5C was then 
performed in 10 reactions each containing an amount of 3C library that represents 
200,000 genome equivalents and 1 fmol of each primer. The 5C analysis of HoxA1 
region was carried out in two biological replicates of fetal lung and foreskin FBs. 5C 
ligation products were amplified using a pair of universal primers that recognize 
the common tails of the 5C forward and reverse primers described above and 
pooled together. To facilitate paired-end DNA sequence analysis on the Illumina 
GA2 platform, paired-end adaptor oligonucleotides were ligated to the 5C library 
using the Illumina PE protocol and PCR amplification of the library was carried 
out for 18 cycles with Illumina PCR primer PE 1.0 and 2.0. The 5C library was then 
sequenced on the Illumina GA2 platform generating 36 base paired end reads. For 
fetal lung FBs we obtained 7,625,276 and 10,947,424 mapped reads for two bio- 
logical replicates of which 1,339,861 and 242,301 could be specifically mapped 
back to interactions within ENm010 using Novoalign (http://www.novocraft. 
com), respectively. For two biological replicates of foreskin FBs we obtained 
7,311,386 and 5,731,107 mapped reads of which 2,752,789 and 66,769 could be 
mapped back to the ENm010 region, respectively. In the case of the knockdown 
study, control green fluorescent protein (GFP) knockdown foreskin FB 5C library 
yielded 4,909,482 mapped reads whereas HOTTIP knockdown foreskin FB had 
5,565,389 mapped reads of which 39,168 and 38,950 could be mapped back to 
ENm010 for control GFP and HOTTIP knockdown, respectively. In the set with 
fetal lung and foreskin fibroblast samples, 5C for ENm010 was multiplexed for 
deep sequencing with 5C of one other region, ENr313; in the set containing the 
knockdown samples, ENm010 was multiplexed with 5C of 30 other genomic 
regions. The different extent of multiplexing resulted in different number of 
sequencing reads mapping back to ENm010. In all instances the mappable reads 
were proportional to the degree of multiplexing, indicating equivalent library 
quality despite different read numbers. Supplementary Table 2 outlines the library 
composition of each experiment. The heat maps are scaled as follows. For Fig. 1a, 
distal (foreskin) FBs: 262-17,467, proximal (lung) FBs: 7-5,846; for Supplementary 
Fig. 6, siGFP: 1-100, siHOTTIP: 1-100. Raw data from the 5C experiments 
used to generate the binned heat maps in Fig. 1a and Supplementary Fig. 6 can be 
found in Supplementary File 1. Raw data are available by request. 
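
To make the read counts above easier to compare across libraries, the fraction of mapped reads assigned to ENm010 can be computed for each one. The sketch below uses only the numbers reported in this section; the library labels and the function name are our own shorthand, and the calculation is an illustration rather than part of the published analysis.

```python
# (total mapped reads, reads mapped to ENm010), as reported in the text.
LIBRARIES = {
    "fetal lung FB, replicate 1": (7_625_276, 1_339_861),
    "fetal lung FB, replicate 2": (10_947_424, 242_301),
    "foreskin FB, replicate 1": (7_311_386, 2_752_789),
    "foreskin FB, replicate 2": (5_731_107, 66_769),
    "foreskin FB, siGFP": (4_909_482, 39_168),
    "foreskin FB, siHOTTIP": (5_565_389, 38_950),
}


def on_target_fraction(mapped: int, on_target: int) -> float:
    """Fraction of mapped reads that fall within the region of interest."""
    return on_target / mapped


fractions = {name: on_target_fraction(*counts) for name, counts in LIBRARIES.items()}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.2%} of mapped reads in ENm010")
```

Comparing these fractions with the library compositions in Supplementary Table 2 is one way to check the statement that read numbers track the degree of multiplexing.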
HOTTIP cloning, sequence and expression analysis. We previously identified a 
portion of HOTTIP as a non protein-coding transcribed region named 
ncHOXA13-96 (ref. 12). This region also overlaps expressed sequence tag (EST) 
clone AK093987 that was previously observed to be expressed in cancer cell lines 


derived from posterior anatomic sites*’. 5’ and 3’ RACE (RLM Race kit, Applied 
Biosystems/Ambion) showed full-length HOTTIP RNA to be 3,764 nucleotides, 
extending the known transcribed region by more than 1,400 bases. BLAST and 
BLAT confirmed that portions of HOTTIP are well conserved in mammals and 
even in avians but had no protein coding potential. Full-length HOTTIP RNA 
sequence has been deposited at NCBI (accession number GU724873). qRT-PCR 
with SYBR Green was conducted as recommended by the manufacturer (Agilent 
Technologies). Primer sequences specific for HOTTIP were CCTAAAGCCACGC 
TTCTTTG (HOTTIP-F) and TGCAGGCTGGAGATCCTACT (HOTTIP-R). 
For Supplementary Fig. 11, endogenous nascent HOTTIP was distinguished from 
ectopic HOTTIP expressed from cDNA using primers that spanned intron-exon 
junctions. 

Strand-specific RT-PCR. RNA extracted from primary foreskin fibroblasts was 
reverse transcribed (SuperScript III, Invitrogen) using combinations of the previ- 
ously described HOTTIP-specific primers HOTTIP-F and/or HOTTIP-R as dia- 
grammed in Supplementary Fig. 1b. Resulting cDNA was then PCR-amplified using 
both HOTTIP-F and HOTTIP-R primers to visually determine strand specificity. 
HOTTIP transcript count per cell. The level of HOTTIP transcript per cell was 
calculated from the level of HOTTIP in 500,000 cells. Full-length HOTTIP in 
pcDNA3.1+ was assayed by qPCR using primers HOTTIP-F and HOTTIP-R 
at predetermined concentrations in triplicate to generate a linear amplification 
curve dependent on the moles of template DNA (Supplementary Fig. 2). The qRT- 
PCR value from 500,000 foreskin fibroblasts was determined and plotted, and the 
corresponding total molecules of transcript was divided by 500,000 to determine 
the approximate number of transcripts per cell. 
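
The calculation described above can be sketched as a standard-curve interpolation. This is a minimal illustration assuming a log-linear relationship between template amount and qPCR threshold cycle; the dilution series, the threshold-cycle values and all function names are hypothetical and are not the data of Supplementary Fig. 2.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_per_cell(standard_moles, standard_ct, sample_ct, n_cells):
    """Interpolate a sample Ct on the standard curve and divide by cell number."""
    log_copies = [math.log10(m * AVOGADRO) for m in standard_moles]
    slope, intercept = fit_line(log_copies, standard_ct)
    sample_copies = 10 ** ((sample_ct - intercept) / slope)
    return sample_copies / n_cells


def ideal_ct(copies):
    """Hypothetical assay: Ct falls by 3.32 cycles per tenfold more template."""
    return 40.0 - 3.32 * math.log10(copies)


# Hypothetical plasmid dilution series (molecules converted to moles).
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_moles = [c / AVOGADRO for c in standard_copies]
standard_ct = [ideal_ct(c) for c in standard_copies]

# Synthetic sample equivalent to 150,000 transcripts from 500,000 cells.
sample_ct = ideal_ct(150_000)
estimate = copies_per_cell(standard_moles, standard_ct, sample_ct, 500_000)
print(f"{estimate:.2f} copies per cell")
```

The synthetic sample is constructed to contain 150,000 molecules in 500,000 cells, so the sketch recovers a value on the order of the ~0.3 copies per cell reported in the main text.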

Single-molecule RNA fluorescence in situ hybridization (RNA-FISH). Single 
molecule RNA-FISH was performed as described in ref. 42 with the following 
modifications: the amount of hybridization solution per chamber was doubled to 
allow for proper coating of the chamber and the amount of glucose-oxidase buffer 
was tripled to assist in image acquisition. Images were acquired using an Olympus 
FV 1000 confocal microscope within 2 h of the addition of the glucose-oxidase buffer. 
RNA interference. Primary foreskin fibroblasts were transfected with siRNAs 
targeting HOTTIP and WDR5 using Lipofectamine 2000 (Invitrogen) as per 
manufacturer's instructions. Total RNA was harvested 48-72h later using 
TRIzol (Invitrogen) and RNeasy Mini Kits (Qiagen) as previously described”. 
For the intronic HOTTIP knockdown experiment in Supplementary Figure 11, 
a pool of 10 siRNAs (Supplementary Table 3) targeting intronic regions in 
HOTTIP were transfected into foreskin fibroblasts, and RNA isolated as above. 
Generation of shRNAs against chicken HOTTIP. A reporter construct encoding 
a GFP-chicken HOTTIP fusion transcript was used in a small-scale screen to 
identify highly effective shRNA constructs. Eleven shRNAs targeting conserved 
regions of chicken HOTTIP were designed and inserted into the pSMP system 
(Thermo/Open Biosystems). The reporter construct and shRNA constructs were 
cotransfected into Phoenix cells, and HOTTIP transcript levels were analysed via 
reduced GFP fluorescence and by qRT-PCR. Three shRNAs that were effective in 
vitro were then cloned into RCAS vector for studies in chick embryos". 

Chick RNAi. RCAS HOTTIP hairpin and RCAS AP viruses were made by trans- 
fecting DF-1 cells with viral DNA. Transfected DF-1 cells were grown and passaged, 
after which the virus-containing supernatant was collected, concentrated and titred. 
Fertilized chicken eggs were incubated in a humidified rotating incubator at 37 °C 
until they reached Hamilton/Hamburger stage 10. Eggs were then windowed to 
expose the embryos. After gently removing the vitelline membrane, chicken 
embryos were microinjected with RCAS-HOTTIP hairpins and RCAS-AP viruses 
at the prospective wing and leg buds. All viral stocks have titres of | X 10°IU ml", 
and each limb was injected five times. The infected embryos were allowed to 
incubate at 37°C and were harvested 2 or 4 days after injection to detect viral 
infection by immunohistochemistry. Total RNA was extracted from injected fore- 
limbs, and RT-PCR analysis was performed 4 days after injection. Chicken 
embryos were harvested 9 days post-injection to carry out whole-mount Alcian 
blue staining. A total of 50 animals were injected. 

Hairpin sequences for chick HOTTIP were TGCTGTTGACAGTGAGCGAC 
CCGAAGATGTGTCTGATTTGTAGTGAAGCCACAGATGTACAAATCAGA 
CACATCTTCGGGCTGCCTACTGCCTCGGA (2-2-1), TGCTGTTGACAGTG 
AGCGCCGCTCTGCTCTCCTCTCTCTCTAGTGAAGCCACAGATGTAGAG 
AGAGAGGAGAGCAGAGCGATGCCTACTGCCTCGGA (3-2-1), and TGCT 
GTTGACAGTGAGCGAATCCTTAATCGAATCTGATTTTAGTGAAGCCACA 
GATGTAAAATCAGATTCGATTAAGGATCTGCCTACTGCCTCGGA (4-4-1). 
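
The hairpin sequences above can be checked for internal consistency by comparing the two stem arms. The sketch below assumes, as our reading and not a statement from the Methods, that each oligonucleotide consists of a 22-nucleotide sense arm, the shared loop TAGTGAAGCCACAGATGTA and a 22-nucleotide antisense arm. The sequences are transcribed from the text with line-wrap spaces removed and the 5’ flank read as TGCTGTTGACAGTGAGCG.

```python
LOOP = "TAGTGAAGCCACAGATGTA"  # loop shared by all three hairpins (assumed layout)
STEM_LEN = 22                 # assumed length of each stem arm

HAIRPINS = {
    "2-2-1": ("TGCTGTTGACAGTGAGCGAC"
              "CCGAAGATGTGTCTGATTTGTAGTGAAGCCACAGATGTACAAATCAGA"
              "CACATCTTCGGGCTGCCTACTGCCTCGGA"),
    "3-2-1": ("TGCTGTTGACAGTG"
              "AGCGCCGCTCTGCTCTCCTCTCTCTCTAGTGAAGCCACAGATGTAGAG"
              "AGAGAGGAGAGCAGAGCGATGCCTACTGCCTCGGA"),
    "4-4-1": ("TGCT"
              "GTTGACAGTGAGCGAATCCTTAATCGAATCTGATTTTAGTGAAGCCACA"
              "GATGTAAAATCAGATTCGATTAAGGATCTGCCTACTGCCTCGGA"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def stem_arms(hairpin: str):
    """Return the arms immediately 5' and 3' of the loop."""
    start = hairpin.index(LOOP)
    end = start + len(LOOP)
    return hairpin[start - STEM_LEN:start], hairpin[end:end + STEM_LEN]


def paired_positions(hairpin: str) -> int:
    """Count positions where the 5' arm matches the 3' arm's reverse complement."""
    five_prime_arm, three_prime_arm = stem_arms(hairpin)
    partner = reverse_complement(three_prime_arm)
    return sum(a == b for a, b in zip(five_prime_arm, partner))


pairing = {name: paired_positions(seq) for name, seq in HAIRPINS.items()}
for name, n_paired in pairing.items():
    print(f"{name}: {n_paired} of {STEM_LEN} stem positions are complementary")
```

Under these assumptions the two arms of each hairpin are complementary at all but the first stem position, which suggests the sequences were transcribed intact.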
HOTTIP overexpression. Full-length HOTTIP and a truncated transcript 
consisting of exons 1 and 2 (HOTTIP^(exons 1-2)) were cloned into the LZRS vector (gift 
of P. Khavari), and then transfected into Phoenix cells (gift of G. Nolan) to 
generate amphotropic retroviruses. Primary human fibroblasts were infected with 
either LZRS-full length HOTTIP (lung), LZRS-truncated HOTTIP (foreskin), or 
LZRS-GFP (both lung and foreskin), then passaged over 60 days, with periodic 
testing of HOXA and HOTTIP expression by qRT-PCR. These cells were used in 
the rescue experiments depicted in Supplementary Fig. 11. 

GST pull-down. Full-length HOTTIP, truncated HOTTIP containing exons 1 and 
2 (HOTTIP^(exons 1-2)), and histone H2B1 mRNA were transcribed in vitro using T7 
polymerase according to manufacturer’s instructions (Promega), denatured, and 
refolded in folding buffer (100 mM KCl, 10 mM MgCl2, Tris pH 7.0). GST-tagged 
WDR5, C-terminal MLL1, RBBP5/Ash2L and TRF1 were expressed in Escherichia 
coli and purified as described**. Each GST-fusion protein was bound to glutathione 
beads (Amersham/GE Healthcare) and blocked with excess yeast total mRNA in 
PB100 buffer (20 mM HEPES pH 7.6, 100 mM KCl, 0.05% NP40, 1 mM DTT, 
0.5 mM PMSF) for 1 h at room temperature. Beads were then incubated with either 
in-vitro-transcribed HOTTIP or histone H2B1 mRNA for 45 min at room tem- 
perature. After three washes in PB200 buffer (20 mM HEPES pH 7.6, 200 mM KCl, 
0.05% NP40, 1mM DTT, 0.5mM PMSF), bound RNAs were extracted and ana- 
lysed by qRT-PCR, as previously described. 

RNA immunoprecipitation. HeLa-WDR5-Flag cells: 48h after Lipofectamine 
2000-mediated transfection of HOTTIP into HeLa WDR5-Flag cells (approxi- 
mately 10’ cells), total protein was extracted as previously described, with modi- 
fications“. Briefly, cells were resuspended in Buffer A (10mM HEPES pH7.5, 
1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT, 1.0 mM PMSF), lysed in 0.25% NP40, 
and fractionated by low speed centrifugation. The nuclear fraction was resus- 
pended and lysed in Buffer C (20 mM HEPES pH 7.5, 10% glycerol, 0.42 M KCl, 
4 mM MgCl2, 0.5 mM DTT, 1.0 mM PMSF). Combined nuclear and cytoplasmic 
fractions were immunoprecipitated with mouse anti-Flag M2 monoclonal antibody 
(Sigma) or mouse IgG affixed to agarose beads (Sigma) for 3 to 4h at 4°C. Beads 
were washed four times with wash buffer (50mM TrisCl pH7.9, 10% glycerol, 
100 mM KCl, 5 mM MgCl2, 10 mM β-mercaptoethanol, 0.1% NP40). After elution 
using Flag peptide (Sigma), co-immunoprecipitated RNA was extracted and ana- 
lysed by qRT-PCR. 

Endogenous WDR5 and SIRT6 RIP: cellular fractions were isolated as above and 
incubated with the anti-WDR5 (ref. 31) or anti-Sirt6 (ab62739, Abcam) antibodies 
overnight at 4 °C. Samples were washed in wash buffer, and co-immunoprecipitated 
RNA was extracted and analysed by qRT-PCR. 

RNA chromatography. Full-length in-vitro-transcribed HOTTIP RNA was conjugated 
to adipic acid dihydrazide agarose beads as described**. The complexed 
beads were incubated with whole cell lysates from HeLa WDR5-Flag cells, washed, 
and bound proteins visualized by western blotting. 

BoxB tethering assay. 293T cells were grown to about 50% confluence in 6-well 
plates on the day of transfection. Using Lipofectamine 2000 (Invitrogen), a plasmid 
encoding a luciferase gene under the control of five tandem GAL4 UAS sites was 
co-transfected with plasmids encoding GAL4-WDR5, GAL4-λN (the 22 amino 
acid RNA-binding domain of the lambda bacteriophage antiterminator protein N) 
peptide fused to a C-terminal GFP tag, BoxB (containing five repeats of the 
λN-specific 19 nucleotide binding site), BoxB fused to full-length LacZ, or BoxB fused 
to full-length HOTTIP. Cells were lysed 48 h after transfection, and luciferase assay 
kit (Promega) was used to determine relative levels of the luciferase gene product, 
following the manufacturer’s protocol. 
Molecular regulation of sexual preference revealed 
by genetic studies of 5-HT in the brains of male mice 


Yan Liu'?*, Yun’ai Jiang'**, Yunxia Si’, Ji-Young Kim*, Zhou-Feng Chen* & Yi Rao® 


Although the question of to whom a male directs his mating 
attempts’ is a critical one in social interactions, little is known 
about the molecular and cellular mechanisms controlling mam- 
malian sexual preference. Here we report that the neurotransmitter 
5-hydroxytryptamine (5-HT) is required for male sexual preference. 
Wild-type male mice preferred females over males, but males lack- 
ing central serotonergic neurons lost sexual preference although 
they were not generally defective in olfaction or in pheromone 
sensing. A role for 5-HT was demonstrated by the phenotype of 
mice lacking tryptophan hydroxylase 2 (Tph2), which is required 
for the first step of 5-HT synthesis in the brain. Thirty-five minutes 
after the injection of the intermediate 5-hydroxytryptophan 
(5-HTP), which circumvented Tph2 to restore 5-HT to the wild- 
type level, adult Tph2 knockout mice also preferred females over 
males. These results indicate that 5-HT and serotonergic neurons 
in the adult brain regulate mammalian sexual preference. 

Interactions between members of the opposite sex are essential for 
sexually reproducing animals. Evolutionary benefits have been pro- 
posed for homo- and bisexual traits’’, which exist in many animals” 
from American bulls* to Japanese rhesus monkeys’. Studies of animals 
with different sexual preferences are essential for understanding the 
seemingly simple decision of a male to court a female. 

Research in Drosophila has uncovered genes required for Drosophila 
courtship preference, but none of their homologues have been shown 
to affect mammalian sexual preference. Research in mammals has 
demonstrated that pheromone sensing in the periphery is important 
for sexual preference. Male mice lacking Trpc2 (Trpc2-/-), which 
encodes a channel expressed in the vomeronasal organ, mounted other 
males, emitted ultrasonic vocalizations (USVs) towards males and 
were less aggressive towards males”®. However, understanding of the 
central mechanisms for sexual preference remains limited. 

The neurotransmitter 5-HT has been implicated in male sexual beha- 
viours such as erection, ejaculation and orgasm in mice and humans’*. 
Depletion of 5-HT by treating animals with p-chlorophenylalanine 
(pCPA) or tryptophan-free diets induced male-male mounting’”’. 
However, pCPA treatment was thought to increase sexual activity 
whereas its effect on sexual preference has not been investigated. 
Interpretation of pCPA results was complicated further by the lack of 
specificity: pCPA may affect noradrenaline and dopamine at higher 
concentrations””. 

Almost all serotonergic neurons in the brain were missing from 
embryogenesis to adulthood in Lmx1b conditional knockout mice in 
which the floxed Lmx1b allele was deleted by ePet1-Cre'’*. We compared 
the behaviours of male mice of different genotypes: ePet1-Cre/Lmx1bflox/Lmx1bflox 
as homozygous mutants (Lmx1b-/-); their littermates ePet1-Cre/Lmx1bflox/+ 
as heterozygous mutants (Lmx1b+/-); and Lmx1bflox/Lmx1bflox 
without ePet1-Cre as the wild type (Lmx1b+/+). We also used 
ePet1-Cre without Lmx1bflox as a control. 

We tested first how a male responded in his home cage when a wild-type 
target C57 male was introduced. Compared to the ePet1-Cre, Lmx1b+/+ and 
Lmx1b+/- controls, Lmx1b-/- mice showed significantly more mounting of 
male intruders (Fig. 1 and Supplementary Movie 1; see Supplementary Data 1 
for numbers of mice used and statistics for all figures). The percentage 
of males who mounted target males was significantly higher in Lmx1b-/- 
males than in ePet1-Cre, Lmx1b+/- and Lmx1b+/+ males (Fig. 1a). Lmx1b-/- 
males mounted with a shorter latency (Fig. 1b), higher frequency (Fig. 1c) 
and longer duration (Fig. 1d). These results show that the absence of 
serotonergic neurons in the brain increased male-male mounting. 

A sexually dimorphic behavioural response of males is to emit 30-110 kHz 
USVs when they encounter female mice or pheromones, which may function as 
love songs to facilitate female receptivity’*. Lmx1b+/+, Lmx1b+/- and 
Lmx1b-/- males were similar in USV emission towards females (Fig. 1e-g). 
However, the percentage of Lmx1b-/- males emitting USVs towards males was 
significantly higher than that of ePet1-Cre, Lmx1b+/+ or Lmx1b+/- males 
(Fig. 1f). Numbers of USV ‘syllables’ emitted towards females were similar 
among ePet1-Cre, Lmx1b+/+, Lmx1b+/- and Lmx1b-/- males (Fig. 1g). Lmx1b-/- 
males emitted more USV ‘syllables’ towards males than ePet1-Cre, Lmx1b+/+ 
and Lmx1b+/- males. The number of USV emissions by Lmx1b-/- males towards 
males was approximately 720 times higher than that of Lmx1b+/+ males 
(Fig. 1g). 


Although Lmx1b-/- males still emitted more USVs towards females, 
the preference for females over males was significantly reduced: the 
ratio of USVs towards females over that for males was only 3 for 
Lmx1b-/- males, significantly reduced from 1,002 for ePet1-Cre males, 
2,438 for Lmx1b+/+ males and 52 for Lmx1b+/- males. 

In the mating choice assay, an oestrous female C57 target mouse and 
a sexually naive male C57 target mouse were introduced into the home 
cage of a test male. Wild-type males preferred to mount female targets 
(Fig. 2a): a higher percentage of Lmx1b+/+ (or ePet1-Cre, Lmx1b+/-) 
males mounted female targets than male targets (Supplementary Movie 
2). However, the percentage of Lmx1b-/- males mounting females was 
not significantly different from that mounting males. ePet1-Cre, 
Lmx1b+/+ and Lmx1b+/- males mounted female targets with a shorter 
latency, higher frequency and longer duration than male targets 
(Fig. 2b, d, e), whereas Lmx1b-/- males mounted males and females 
with similar latencies, frequencies and durations (Supplementary 
Movies 2 and 3). Thus, elimination of serotonergic neurons led to a 
loss of sexual preference in mounting. 

Further analyses were carried out to detect a change in sexual 
preference separate from an increase in sexual drive: (1) in the mating 
choice assay, all ePet1-Cre, Lmx1b+/+ and Lmx1b+/- males mounted 
females before males, whereas 46.2% of Lmx1b-/- males mounted males first 
(Fig. 2c); (2) the mounting frequency ratio of Lmx1b-/- males in the 
mating choice assay, (female mounting frequency − male mounting 
frequency)/(female + male mounting frequency) (that is, (♀ − ♂)/(♂ + ♀)), was 
significantly different from those of ePet1-Cre, Lmx1b+/+ and Lmx1b+/- males 
(Fig. 2f); and (3) when a test male was presented only with an oestrous 
female target, Lmx1b-/- males were not statistically significantly 

Figure 1 | Male-male mounting and USV by mice lacking central 
serotonergic neurons. a-g, Numbers of mice used and statistical analysis are 
all included in Supplementary Data 1. a-d, A test male was presented in its 
home cage with an adult wild-type male and its behaviour was recorded for 
30 min (all data shown as mean ± s.e.m.). Compared with Lmx1b+/+, 
Lmx1b+/- or ePet1-Cre males, Lmx1b-/- males mounted males at a higher 
percentage (a), lower latency (b), higher frequency (c) and for a longer duration 
(d). e, Typical USV patterns emitted by males when presented with female or 
male intruders. The two left panels show USVs in 2 min, whereas the two right 


different from wild-type and heterozygous males in male-female 
mounting (Supplementary Fig. 1). 
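
The preference ratio used in point (2), (♀ − ♂)/(♂ + ♀), can be illustrated with a minimal Python sketch. The function name and the example counts are hypothetical and are not taken from the paper; the sketch only assumes that the ratio is computed per test male from the behaviour directed at the female and at the male target.

```python
def preference_index(female: float, male: float) -> float:
    """Return (female - male) / (female + male).

    +1 means behaviour was directed only at the female target,
    -1 only at the male target, and 0 means no preference.
    """
    total = female + male
    if total == 0:
        # No behaviour towards either target: the ratio is undefined.
        raise ValueError("no behaviour recorded towards either target")
    return (female - male) / total


# Hypothetical mounting counts, for illustration only (not data from the paper).
print(preference_index(female=9, male=1))  # strong female preference
print(preference_index(female=5, male=5))  # no preference
```

The same function applies unchanged to sniffing times or time spent above bedding, because the ratio is unit-free.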

We tested male mice for their preference of pheromones present in 
the genitals or the bedding. In the genital odour preference assay", a 
slide with one half smeared with female genitals and the other half with 
male genitals was presented to a test male. The total time spent sniffing 
both halves of the slide was reduced in Lmx1b-/- males (Supplementary 
Fig. 2a). Lmx1b+/+ and Lmx1b+/- littermates spent significantly 
more time sniffing female than male genital odour, whereas Lmx1b-/- 
males spent equal time sniffing female and male genital odours 
(Fig. 3a). Lmx1b+/+, Lmx1b+/- and Lmx1b-/- males were similar in the 
amount of time spent sniffing male genital odour. Female genital 
odour sniffing time was less in Lmx1b-/- males than in Lmx1b+/+ 
and Lmx1b+/- littermates (Fig. 3a). The genital odour preference ratio 
((♀ − ♂)/(♂ + ♀)) of Lmx1b-/- males was significantly lower than those 
of Lmx1b+/+ and Lmx1b+/- males (Fig. 3b). Compared with 
Lmx1b+/+ and Lmx1b+/- males, a significantly higher percentage 
(62.5%) of Lmx1b-/- males spent more time sniffing male than female 
genital odour (Fig. 3c). 

In the bedding preference assay"®, the total time spent over male and 
female bedding was similar among ePet1-Cre, Lmx1b+/+, Lmx1b+/- 
and Lmx1b-/- males (Supplementary Fig. 2b). ePet1-Cre, Lmx1b+/+ 
and Lmx1b+/- males spent significantly more time above female than 
male bedding, whereas Lmx1b-/- males spent equal time above female 
and male beddings (Fig. 3d). Compared with ePet1-Cre, Lmx1b+/+ 
and Lmx1b+/- males, Lmx1b-/- males spent more time above male 
bedding and less time above female bedding. The bedding preference 
ratio of Lmx1b-/- males was significantly lower than those of ePet1-Cre, 
Lmx1b+/+ and Lmx1b+/- males (Fig. 3e). The percentage of 
males who spent more time above male bedding was significantly 
higher in Lmx1b-/- males (58.8%) than in ePet1-Cre (0%), 
Lmx1b+/+ (6.3%) or Lmx1b+/- (12.5%) males (Fig. 3f). 

panels show parts of USV graphs at higher magnifications. f, Female intruders 
elicited USVs from almost all ePet1-Cre, Lmx1b-/-, Lmx1b+/+ or Lmx1b+/- 
males. Male intruders elicited USVs more from Lmx1b-/- males than from 
ePet1-Cre, Lmx1b+/+ or Lmx1b+/- males. g, The number of USVs emitted by 
Lmx1b-/- males towards males is higher than those by ePet1-Cre, Lmx1b+/+ 
or Lmx1b+/- males, whereas ePet1-Cre, Lmx1b+/+, Lmx1b+/- and Lmx1b-/- 
males were similar in USVs towards females. *P < 0.05, **P < 0.01, 
***P < 0.001. 


Thus, in both the genital odour and bedding assays, Lmx1b-/- 
males had lost preference for female pheromones over male pheromones: 
in the genital odour preference assay, Lmx1b-/- males showed 
decreased sniffing time for female genital odour; in the bedding 
preference assay, Lmx1b-/- males showed increased time spent over male 
bedding and decreased time over female bedding. 

Multiple assays involving odour or pheromone sensing were carried 
out to test for possible changes in olfaction. In the sesame oil 
preference assay’”, Lmx1b+/+ and Lmx1b-/- males were indistinguishable 
in spending significantly more time with sesame than air (Supplementary 
Fig. 3a). In the fox urine avoidance assay’’, Lmx1b+/+ and 
Lmx1b-/- males were also similar (Supplementary Fig. 3b). Thus, 
Lmx1b-/- males were not defective in either innate attractive or 
avoidance responses. 

In the social approach assay’, Lmx1b+/+ and Lmx1b-/- males 
were similar in spending more time close to a strange male than the 
empty chamber (Supplementary Fig. 3c). 

In the social recognition assay”, Lmx1b+/+ and Lmx1b-/- males 
spent a similar amount of time exploring the first intruder at initial 
presentation, displayed social habituation towards the familiar 
intruder over the next three presentations and displayed dishabitua- 
tion when a new intruder was introduced (Fig. 4a). 

An operant conditioning assay was used to test whether Lmx1b-/- 
males could distinguish between male and female pheromones”'. Two 
arms of a T maze were supplied with the odour of either female or male 
urine. Electroshock was applied in such a way that the test mice had to 
run or stay in the same arm depending on the urine. Over 3 days of 
training, Lmx1b+/+ and Lmx1b-/- males were similar in learning to 
avoid punishment (Fig. 4b). Thus, no olfactory defects for general 
odours or pheromones were detected in Lmx1b-/- males. 

Results from Lmx1b-/- mice indicate a role for serotonergic neurons. 
To study the role of 5-HT, we used mice unable to synthesize 

Figure 2 | Lack of sexual preference by mice without central serotonergic 
neurons. a-f, Each test male was presented with a male and an oestrous female, 
and its mating choice was analysed for 15 min. a, More ePet1-Cre, Lmx1b+/+ 
and Lmx1b+/- males mounted female than male targets. A similar percentage 
of Lmx1b-/- males mounted females and males. b, ePet1-Cre, Lmx1b+/+ and 
Lmx1b+/- males mounted female targets faster than male targets. Mounting 
latencies of Lmx1b-/- males for females and males were similar. c, More than 
40% of Lmx1b-/- males but none of the ePet1-Cre, Lmx1b+/+ or Lmx1b+/- 
males chose a male as their first mounting target. d, ePet1-Cre males mounted 
females significantly more often than males, as did Lmx1b+/+ and Lmx1b+/- 
males. Lmx1b-/- males mounted females as often as males (P > 0.05, t-test). 
e, ePet1-Cre males spent more time mounting females than males, as did 
Lmx1b+/+ and Lmx1b+/- males. Lmx1b-/- males did not show differences in 
mounting males or females. f, The mounting frequency ratio of Lmx1b-/- males was 
different from that of ePet1-Cre, Lmx1b+/+ and Lmx1b+/- males. *P < 0.05, 
**P < 0.01, ***P < 0.001. 


5-HT in the brain. 5-HT is synthesized in two steps: tryptophan is 
converted by a Tph into 5-HTP, which is converted into 5-HT by 
5-hydroxytryptophan decarboxylase and aromatic L-amino-acid 
decarboxylase. 

There are two Tph enzymes: Tph2 is required centrally and Tph1 
peripherally. We have generated Tph2-/- mice (J.-Y.K. et al., manuscript 
in preparation), which were viable **. High-performance liquid 
chromatography (HPLC) analysis showed that the 5-HT level was 
significantly reduced in the brains of Tph2-/- males (Supplementary 
Fig. 4a). Male-male mounting (Supplementary Movie 4) was 
significantly higher in Tph2-/- males than in either Tph2+/+ or heterozygous 
Tph2+/- males: the percentage was significantly higher, duration 
longer, latency shorter and frequency higher (Supplementary Fig. 
4b, c and Fig. 5a, b). In the bedding preference assay, both Tph2+/+ 
and Tph2+/- males preferred female over male bedding, whereas 
Tph2-/- males showed no preference (Fig. 5c). In the genital odour 
preference assay, both Tph2+/+ and Tph2+/- males preferred female 
over male genital odour, but Tph2-/- males showed no preference 
(Fig. 5d). 

When presented with an oestrous female target, male-female 
mounting was not significantly changed in Tph2-/- males (Supplementary 
Fig. 5). In mating choice, Tph2-/- males had lost preference 
for females over males in percentage, latency, frequency and 
duration (Supplementary Fig. 6a, b, d, e). No control males mounted 
target males before females, whereas more than 40% of Tph2-/- males 
mounted males first (Supplementary Fig. 6c). The mounting frequency 
ratio of Tph2-/- males was significantly different from those of 
Tph2+/+ and Tph2+/- males (Supplementary Fig. 6f). 

Lmx1b-/- and Tph2-/- mice lack 5-HT from embryogenesis. To 
study the role of 5-HT in adulthood, we took two complementary 

Figure 3 | Loss of sexual preference for genital odour and bedding by males 
without central serotonergic neurons. a, Lmx1b+/+ males spent more time 
sniffing female than male genital odour, as did Lmx1b+/- males. Lmx1b-/- 
males spent a similar amount of time on female and male genital odour. The three 
groups were not significantly different in male genital odour sniffing time but 
Lmx1b-/- males spent less time sniffing female genital odour than the other 
two groups. b, Sniffing ratio of Lmx1b-/- males was significantly different from 
those of Lmx1b+/+ and Lmx1b+/- males (P < 0.05 for Lmx1b+/+ versus Lmx1b-/-, 
P < 0.05 for Lmx1b+/- versus Lmx1b-/-, P > 0.05 for Lmx1b+/+ versus 
Lmx1b+/-; one-way ANOVA). c, Compared with Lmx1b+/+ and Lmx1b+/-, a 
higher percentage of Lmx1b-/- males spent more time sniffing male than 
female genital odour. d, ePet1-Cre males spent more time above female bedding 
than male bedding, as did Lmx1b+/+ and Lmx1b+/- males. Lmx1b-/- males 
spent a similar amount of time above female and male bedding. Compared with 
ePet1-Cre, Lmx1b+/- and Lmx1b+/+, Lmx1b-/- males spent less time above 
female bedding but more time above male bedding. e, The bedding time ratio of 
Lmx1b-/- was different from ePet1-Cre and Lmx1b+/-. f, Compared with 
ePet1-Cre, Lmx1b+/+ and Lmx1b+/-, a significantly higher percentage of 
Lmx1b-/- males spent more time above male bedding. *P < 0.05, **P < 0.01, 
***P < 0.001. 


approaches: first, we depleted 5-HT from adult mice pharmacologically 
with pCPA”*; then we attempted to rescue the phenotype of adult 
Tph2-/- mutants. 

Adult C57BL/6J males were injected with either pCPA or saline for 
three consecutive days. 5-HT level was significantly reduced by pCPA 

Figure 4 | Odour discrimination. a, Both Lmx1b+/+ and Lmx1b-/- males 
showed habituation and dishabituation in sniffing time. No statistical 
difference was found between Lmx1b+/+ and Lmx1b-/- males at any point. 
b, After seven training sessions with male and female urine, no significant 
difference was detected between Lmx1b+/+ and Lmx1b-/- males at any point. 

Figure 5 | Brain chemistry and behaviours of Tph2 knockout males. 
a, b, Compared with Tph2+/+ and Tph2+/-, Tph2-/- males showed a shorter 
latency (a) and higher frequency in mounting males (b). c, Both Tph2+/+ and 
Tph2+/- males significantly preferred female over male bedding, whereas 
Tph2-/- males did not show a preference between male and female bedding. 
d, Both Tph2+/+ and Tph2+/- males significantly preferred female over male 
genital odour, whereas Tph2-/- males did not show a preference between male 
and female genital odour. *P < 0.05, **P < 0.01, ***P < 0.001. 


(Supplementary Fig. 7). pCPA-treated males showed shorter latency, 
higher frequency and longer duration than control males in mounting 
target males (Supplementary Fig. 8a—d), and lost bedding preference 
(Supplementary Fig. 8e, f). 

To test whether 5-HTP injection into adult mice could rescue the 
Tph2-/- phenotype, we examined first whether 5-HTP could rescue 
5-HT synthesis in Tph2-/- males and found that 5-HT levels were 
restored 35 min after intraperitoneal injection of 5-HTP but not saline 
(Fig. 6a and Supplementary Fig. 9a, b). 

5-HTP significantly reduced male-male mounting of Tph2-/- 
males: the percentage was decreased, latency increased, frequency 
decreased and duration shortened, all returning to wild-type levels 
(Fig. 6b, c and Supplementary Fig. 9c, d). 5-HTP rescued the loss of 
sexual preference in mounting latency, frequency and duration in the 
mating choice assay (Supplementary Fig. 10a-c) and the bedding 
preference of Tph2-/- males (Fig. 6d and Supplementary Fig. 9e). 

When a test male was presented with a target female, Tph2-/- males 
were similar to wild-type and heterozygous males in mounting 
percentage, latency, frequency and duration (Supplementary Figs 5, 11). 
5-HTP injection into Tph2-/- males did not affect male-female 
mounting (Supplementary Fig. 11), although 5-HTP injection into 
wild-type males reduced male-female mounting. Because 5-HTP 
injection in wild-type males increased the level of 5-HT beyond the 
wild-type level (Supplementary Fig. 9a, b), this indicated a 
dosage-sensitive effect of 5-HT: 5-HT at concentrations above the wild-type 
level inhibited male-female mounting, but 5-HT concentrations 
between the wild-type and Tph2-/- levels did not affect male-female 
mounting. 

We conclude that central serotonergic signalling is crucial for male 
sexual preference in mice. This is the first time, to our knowledge, that 
a neurotransmitter in the brain has been demonstrated to be important 
in mammalian sexual preference. Previous studies in mammals have 
implicated 5-HT and dopamine in male sexual behaviours, but neither 
has been demonstrated to have any role in sexual preference: dopa- 
mine is thought to facilitate male sexual behaviours whereas 5-HT is 

Figure 6 | 5-HTP rescue of chemical and behavioural deficits in Tph2 
knockout mice. a, Levels of 5-HT and 5-hydroxyindoleacetic acid (5-HIAA) 
were analysed in Tph2+/+ and Tph2-/- males 35 min after injection of either 
5-HTP (40 mg kg−1 body weight) or control saline. b, c, Male-male mounting 

thought to inhibit sexual behaviours’ ''”°. Our studies have established 
a role for 5-HT in male sexual preference. Multiple results showed a 
loss in sexual preference beyond or separate from hypersexuality: (1) 
the ratio of male-male and male-female interactions was repeatedly 
measured to analyse sexual preference (Figs 2f, 3b, e, 5c, d, 6d and 
Supplementary Figs 6f, 8f, 9e, 10d); (2) Lmx1b-/- males showed 
increased USVs towards males but not towards females (Fig. 1g); (3) 
in mating choice, the latency, frequency and duration of Lmx1b-/- 
males to mount males, but not to mount females, were changed (Fig. 2a, 
b, d, e); (4) in bedding preference, Lmx1b-/- (Fig. 3d) and Tph2-/- 
males (Figs 5c, 6d) showed an increase in time spent over male bedding 
but a decrease in time over female bedding; (5) wild-type males always 
mounted females before males but a significant fraction of Lmx1b-/- 
or Tph2-/- males mounted males first (Fig. 2c and Supplementary Fig. 
6c); (6) in the genital odour preference assay, both Lmx1b-/- (Fig. 3a) 
and Tph2-/- (Supplementary Fig. 5d) males showed a decrease in time 
on female genital odour, which could not be explained by hypersexuality; 
and (7) when presented with an oestrous target female, neither 
Lmx1b-/- males (Supplementary Fig. 1) nor Tph2-/- males (Supplementary 
Fig. 5) were different from wild-type males. 

Increased sexual drive was observed in males lacking 5-HT when 
they were tested in the presence of live target males and females 
(Supplementary Fig. 6). This has been noted before in mice defective 
for Trpc2 and vomeronasal organ olfaction®®. Trpc2-/- males have 
been previously reported to have lost male-female preference in mating 
choice*®. Trpc2-/- males showed increased mounting of both 
males and females (figure 2c in ref. 6). The conclusion of a loss in 
sexual preference in Trpc2-/- males was inferred from a relative 
change: Trpc2-/- males showed a 2-fold preference for females over 
males whereas the wild type showed a 10-fold preference. The phenotypes 
reported here for Lmx1b-/-, Tph2-/- and pCPA-treated 
males were stronger than for Trpc2-/- males in mating choice: these 
males did not show significant preference for females (Fig. 2 and 
Supplementary Fig. 6). 

At present, it is not known whether 5-HT regulates the vomeronasal 
organ pathway in pheromone sensing or acts further downstream in 
behavioural decisions. Differences have been noted between Trpc2 and 
Lmx1b in the brain: aggression was largely lost in Trpc2-/-, but not 
Lmx1b-/-, mice (data not shown). It is more likely that 5-HT regulates 
central decision-making than that it influences peripheral olfaction. 
However, we cannot completely rule out the possibility that 5-HT 
regulates a specific innate olfactory pathway processing sexual 
information’’. In mice, it will be interesting to identify specific subsets 
of serotonergic neurons and serotonergic receptors involved in sexual 
preference. 

An unavoidable question raised by our findings is whether 5-HT has 
a role in sexual preference in other animals. In a positron emission 
tomography study of humans, the response of heterosexual men to the 
selective serotonin reuptake inhibitor (SSRI) fluoxetine was found to 
be different from that of homosexual men”*. SSRIs inhibited compul- 
sive sexual behaviours in homosexual and bisexual men”’. However, so 
far, none of these studies has investigated whether 5-HT has a role in 

in Tph2-/- mice was significantly rescued by 5-HTP: the latency was 
lengthened and frequency reduced. d, Bedding preference was monitored 
between 35 and 40 min after injection. 5-HTP could significantly restore the 
preference for female over male bedding by Tph2-/- males. 

sexual preference. Attempts have been made to map genetic loci affecting 
human sexuality”, although specific genes have not been identified. 
Our discovery of a role for serotonergic signalling in mouse 
sexual preference should stimulate further studies into the role of 
5-HT in sexual interactions in particular and roles of neurotransmitters 
in mammalian social relationships in general. 


METHODS SUMMARY 


We used conditional knockout mice for Lmx1b and knockout mice for Tph2. 
Levels of 5-HT in these mice and their heterozygous and wild-type littermates 
were measured by HPLC. Most of the behavioural assays were similar to estab- 
lished methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Mouse stocks. ePet1-Cre mice were a gift from E. S. Deneris and the floxed Lmx1b 
mice were a gift from R. Johnson. Tph2 knockout mice were generated by deleting 
exon 5, which encodes the tryptophan hydroxylase domain (for details see J.-Y.K. 
et al., manuscript submitted). Mice were weaned at the age of 21 days. Mice were 
maintained on a 12 h light, 12 h dark schedule and housed initially in groups of five 
up to the tenth week and then singly housed until the end of experiments. Food and 
water were provided ad libitum. Room temperature was 23 ± 1 °C. Humidity was 
40-60%. All test mice were 12-16 weeks old. The target mice were 11-13 weeks old. 
Mouse genotyping. Genomic DNA was extracted from mouse tail tissues on the 
day of weaning. Mutant mice were generated by crossing ePet1-Cre mice with 
floxed Lmx1b mice, followed by intercrossing within the F1 generation. 
Littermates used in the tests were of the same sex and similar body weight as 
the knockout mice. The primers were: AGGCTCCATCCATTCTTCTC (floxed 
Lmx1b1); CCACAATAAGCAAGAGGCAC (floxed Lmx1b2); 
ATTTGCCTGCATTACCGGTCG (Cre1); CAGCATTGCTGTCACTTGGTC (Cre2). 
Immunocytochemical analysis with anti-5-HT antibodies confirmed that 
5-HT-positive neurons were absent in Lmx1b knockout mice (data not shown). 
The Tph2 line was maintained by crossing heterozygotes. Littermates included 
wild-type, heterozygous and homozygous knockout mice. The primers for 
genotyping were: GGGCATCTCAGGACGTAGTAG; GGGCCTGCCGATAGTAACAC; 
GCAGCCAGTAGACGTCTCTTAC. 
Measurement of 5-HT. 5-HT and its metabolites were separated by 
HPLC and measured by an electrochemical detector in samples from adult male mice. 
In 5-HTP rescue experiments, mice were injected with 40 mg kg−1 5-HTP or saline 
(both at a volume of 5 ml kg−1). They were euthanized 35 min later. The brain was 
dissected and the raphe region was isolated on ice. Samples were weighed before 
ultrasonication. Monoamines were extracted with perchloric acid. The sample was 
filtered through a 0.22 µm filter before being injected into the RP-HPLC system (ESA). Noradrenaline, 
3,4-dihydroxyphenylacetic acid (DOPAC), dopamine, HIAA, homovanillic acid 
(HVA) and 5-HT were measured by an electrochemical detector. Their concentrations 
were calculated by CoulArray software (ESA) based on standard samples. 
Values of amine per wet tissue weight are shown in the final figures. 
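
The dosing arithmetic implied above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the 25 g example body weight and the ng/mg units for the tissue normalization are assumptions, whereas the 40 mg kg−1 dose and 5 ml kg−1 volume come from the text.

```python
def injection_plan(body_weight_g, dose_mg_per_kg=40.0, volume_ml_per_kg=5.0):
    """Return (dose in mg, injection volume in ml, stock concentration in mg/ml)."""
    weight_kg = body_weight_g / 1000.0
    dose_mg = dose_mg_per_kg * weight_kg
    volume_ml = volume_ml_per_kg * weight_kg
    # The stock concentration does not depend on body weight.
    concentration_mg_per_ml = dose_mg_per_kg / volume_ml_per_kg
    return dose_mg, volume_ml, concentration_mg_per_ml


def amine_per_wet_weight(amine_ng, tissue_mg):
    """Normalize an amine amount to wet tissue weight (assumed units: ng per mg)."""
    if tissue_mg <= 0:
        raise ValueError("tissue weight must be positive")
    return amine_ng / tissue_mg


# Hypothetical 25 g mouse, for illustration only.
dose, volume, concentration = injection_plan(25.0)
print(dose, volume, concentration)
```

Under these assumptions, delivering 40 mg kg−1 in 5 ml kg−1 requires an 8 mg ml−1 solution regardless of body weight.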
Order of behavioural assays. Male mutant mice and their littermates at 12-13 
weeks of age and of similar body weight were sexually naive and group-housed 
with same-sex mice before 10 weeks of age. After 2 weeks of single housing, mice 
were tested in the following order: bedding preference, male-male resident- 
intruder assay, mating choice assay, sexual behaviours with an oestrous female, 
bedding preference again (no difference was observed with results from the first 
bedding preference). Mice were given one week of rest between each test. For 
Lmx1b mice, the same group of mice were used in male-male mounting, mating 
choice and male-female mounting. For Tph2 mice, a different group were used for 
male-female mounting. Sexually experienced mice were used for USV, social 
approach, habituation and olfactory learning assays. Sexually naive mice were 
used for urine preference and olfactory tests. 
Resident-intruder tests. All test mice were sexually naive. The bedding of the test 
mice had not been changed for at least 4 days. Intruder mice were 11-13 weeks old, 
sexually naive and group-housed C57BL/6J males. All activities within a test were 
recorded by an infrared camera (Sony Video Recorder, DCR-HC26C). Mounting 
latency, mounting frequency and total duration of mounting within 30 min were 
measured. 
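
As an illustration of how the three mounting measures relate to each other, the following sketch derives them from manually scored bouts. The bout representation (start and end times in seconds after the intruder was introduced) and the function name are assumptions; the paper does not describe its scoring software, nor how latency was assigned when no mounting occurred (here it is returned as None).

```python
from typing import List, Optional, Tuple


def mounting_metrics(
    bouts: List[Tuple[float, float]], window_s: float = 1800.0
) -> Tuple[Optional[float], int, float]:
    """Return (latency, frequency, total duration) within the test window.

    bouts: (start_s, end_s) pairs; the default window is the 30 min test.
    """
    # Keep bouts that start inside the window and clip their end to the window.
    clipped = [(s, min(e, window_s)) for s, e in sorted(bouts) if s < window_s]
    latency = clipped[0][0] if clipped else None  # time of the first mount
    frequency = len(clipped)                      # number of mounting bouts
    duration = sum(e - s for s, e in clipped)     # total time spent mounting
    return latency, frequency, duration


# Hypothetical scored bouts, for illustration only.
print(mounting_metrics([(120.0, 125.0), (300.0, 304.5)]))
```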
Mating choice assay. Beddings of test mice had not been changed for at least four 
days. A group-housed sexually naive 11-13-week-old C57BL/6J male and a sexually 
naive oestrous 10-week-old C57BL/6J female were introduced into the cage 
of each test male. Each assay lasted 15 min after the target mice were introduced. 
All activities were recorded by an infrared camera. The latency, frequency and 
duration of mounting of male or female targets were analysed. 
Sexual behaviours with females. An oestrous female was presented to a test male 
and video was recorded for 30 min using an infrared camera. The latency, fre- 
quency and duration of male mounting of the female were analysed. 
USVs. Tests were carried out with singly housed adult males during the dark phase 
in the home cage. UltraSoundGate 116-200 system (Avisoft) was used to record 
the ultrasound. We recorded the background sound for 1 min before a stimulus 
mouse of 10-13 weeks old was introduced. The recording lasted for 2 min. 
Recorded data were analysed with SASLab (Avisoft)°. Sounds over the frequency 
range of 30-110 kHz were analysed. Profiles of background noise created by 
mouse movement were very different from USVs. To confirm that the resident 
mouse was the source of USVs, we recorded from assays in which either the 
resident or the intruder mouse was devocalized. We were able to record robust 
USVs (presented in our figures) only when the intruder mouse was devocalized 
and not when the resident mouse was devocalized. 
Genital odour preference assay. This assay was modified from a previously 
described procedure’’. The anogenital area scent from a male was rubbed on 
the left or right side of a clean glass microscope slide while the anogenital area 
scent from a female was rubbed on the other side of the slide. Five seconds later, the 
slide was hung in the middle of the cage by a clamp. The slides were ~5 cm over the 
bedding. Activities of the test mice were recorded for 3 min by an infrared camera 
and the sniff time on the scent portion on either side was analysed as was the 
amount of time a test male licked the slide or its nose touched the slide. 
Bedding preference assay. Bedding from group-housed adult C57BL/6J males or 
females was not changed for 4 days. Ten grams of male or female bedding were put 
on one side of the bottom of a cage in an area of 11.5 × 17 cm². Male and female 
beddings were prevented from mixing by a plastic bar of 6 cm. The size of the cage was 
29 × 17 × 15 cm (length × width × height)'®. A grid of plastic bars separated the 
test mice from the bedding on the bottom of the cage. The bars were 5 mm wide 
with 5 mm intervals. The test mouse was put into the cage to be familiarized with 
the cage without bedding for 5 min before the mouse was taken out and the bedding 
and a clean grid were put into the cage. After each assay, the cage was washed with 
water and then alcohol to remove odour. 

Olfactory learning assay. We employed a T maze in which electric shock could be 
applied to either side of the horizontal chamber as described previously". Briefly, 
there was a door at the intersection of the horizontal and vertical chambers. The 
horizontal chamber of 8 × 8 × 60 cm³ was divided into three parts: a left arm of 
8 × 8 × 23 cm, a right arm of 8 × 8 × 23 cm and a middle zone of 8 × 8 × 14 cm. 
Each test mouse was introduced into the vertical chamber of the T maze. After it 
entered the horizontal chamber, the door between the vertical and horizontal 
chambers was closed and the mouse was allowed to walk within the horizontal 
chamber. The mouse was not allowed to stay in the middle zone for longer than 8 s, 
otherwise it would be punished with electroshock. The position of the test mouse 
was monitored by a video recorder. Urine samples were collected from more than 
20 C57BL/6J males or females and stored at —20 °C. A 1.5 ml urine sample was used 
for each test. The odour of male or female urine was puffed into the left or right arm 
of the horizontal chamber and expirated from the middle zone. Odour was pre- 
sented for 50 s. We trained the test male mouse with electroshock to stay in the arm 
with female odour and to avoid the arm with male odour. The mouse had to make a 
decision to stay in or leave the arm when an odour was presented. Each training 
session of 18 trials lasted for 30 min. Every mouse was given 6 training sessions over 
3 days before the final test. There were 10 trials in the final test. The percentages of 
correct choices in every training session and the final test were analysed. 

Innate behavioural responses to odours. The set-up is the same as that for the 
olfactory learning assay, except that no electroshock was applied. Sexually naive 
males (mutants or littermates) of 10-16 weeks old were tested for their choices of 
fox urine versus air, or sesame oil versus air. Fox urine was used to test the innate 
avoidance of a predator’s odour. Fox urine was diluted at two concentrations (60× 
and 20×). The main air flow velocity was 250 l h⁻¹. The air flow through fox urine 
was 70 ml min⁻¹ or 210 ml min⁻¹, respectively. The time that mice spent in the 
empty arm or the fox urine arm was recorded by Matlab software. Sesame oil 
diluted 83 was used to test innate attraction to food. Time spent in the air arm or 
the sesame oil arm was recorded by Matlab software. 

Social approach. The social approach experiment was tested in a modified 
T-shaped box. There was a small cage separated by wire at each end of the arms 
in the horizontal chamber. A test mouse was allowed to habituate for five minutes 
before an unfamiliar target male was randomly placed in one of the small cages. 
The target mouse could be seen, smelled and heard, but could not be touched. The 
test mouse was allowed to move in the box for 5 min. Its location was video 
recorded and analysed by a computer. 

Social memory. Singly housed adult males were tested in the dark phase and in the 
room where they were reared. Ovariectomized C57BL/6J females were used as 
stimulus mice”. They were ovariectomized at 6 weeks old and used 2 weeks later. 
A stimulus mouse was introduced into the cage housing a test mouse for 1 min and 
then was removed. After an interval of 10 min, the same stimulus female was 
introduced again for 1 min. The stimulus mouse was presented four times. On 
the fifth time, a new stimulus mouse was introduced for 1 min. The behaviour of 
test mice was videotaped and time spent on body sniffing was analysed. 

5-HT depletion by pCPA treatment. Male C57BL/6J mice of 11-13 weeks of age 
were used. They were injected with either 500 mg kg⁻¹ of pCPA (Sigma, C6506) or 
saline control for 3 consecutive days after 4 days of being singly housed. Animals 
were tested with adult C57 female mice. Mice that did not show mounting beha- 
viour in 15 min were discarded. Mice that qualified were then singly housed for 1 
week before social behaviour testing and their bedding was not changed. Animals 
were randomly divided into pCPA or saline treatment groups. pCPA was suspended 
in 1% Tween saline at a concentration of 50 mg ml⁻¹. The pCPA group 
were injected intraperitoneally with pCPA (10 ml kg⁻¹) at 72, 48 and 24 h before 
testing. The control group received 1% Tween saline. Resident-intruder and mat- 
ing choice assays were carried out. Behavioural tests were performed in the dark. 
Local, persistent activation of Rho GTPases during 
plasticity of single dendritic spines 


Hideji Murakoshi¹, Hong Wang¹ & Ryohei Yasuda¹,² 


The Rho family of GTPases have important roles in the morpho- 
genesis of the dendritic spines’ of neurons in the brain and syn- 
aptic plasticity*? by modulating the organization of the actin 
cytoskeleton’. Here we used two-photon fluorescence lifetime 
imaging microscopy'’* to monitor the activity of two Rho 
GTPases—RhoA and Cdc42—in single dendritic spines under- 
going structural plasticity associated with long-term potentiation 
in CA1 pyramidal neurons in cultured slices of rat hippocampus. 
When long-term volume increase was induced in a single spine using 
two-photon glutamate uncaging'*’’, RhoA and Cdc42 were rapidly 
activated in the stimulated spine. These activities decayed over about 
five minutes, and were then followed by a phase of persistent activa- 
tion lasting more than half an hour. Although active RhoA and 
Cdc42 were similarly mobile, their activity patterns were different. 
RhoA activation diffused out of the stimulated spine and spread 
over about 5 µm along the dendrite. In contrast, Cdc42 activation 
was restricted to the stimulated spine, and exhibited a steep gra- 
dient at the spine necks. Inhibition of the Rho-Rock pathway pref- 
erentially inhibited the initial spine growth, whereas the inhibition 
of the Cdc42-Pak pathway blocked the maintenance of sustained 
structural plasticity. RhoA and Cdc42 activation depended on 
Ca²⁺/calmodulin-dependent kinase (CaMKII). Thus, RhoA and 
Cdc42 relay transient CaMKII activation’ to synapse-specific, 
long-term signalling required for spine structural plasticity. 

Previous studies using two-photon fluorescence lifetime imaging 
microscopy (2pFLIM) and two-photon glutamate uncaging revealed 
the spatiotemporal dynamics of the signalling proteins CaMKII and 
HRas (also known as transforming protein 21) in single spines under- 
going structural plasticity and long-term potentiation'*’’. CaMKII 
activation is restricted to spines, and decays rapidly with a time con- 
stant of about ten seconds'*. In contrast, HRas activity spreads from 
the stimulated spines along dendrites and into surrounding spines over 
about 10 µm (ref. 12). However, to achieve long-lasting, spine-specific 
plasticity, there should also exist signalling pathways that relay com- 
partmentalized signalling on the timescale of minutes to hours. Rho 
GTPases may constitute such signalling, because they are important in 
regulating the actin cytoskeleton*’’, which is essential for spine- 
specific, long-term structural and functional plasticity'*””. 

To measure the activation of Rho GTPases in single dendritic spines, 
we developed fluorescence resonance energy transfer (FRET)-based sen- 
sors optimized for imaging under 2pFLIM using a design similar to a 
previously developed HRas sensor’’. The RhoA/Cdc42 sensors consist 
of two components: RhoA/Cdc42 tagged with monomeric enhanced 
green fluorescent protein (mEGFP) and their binding partner, Rho 
GTPase binding domain (RBD) of Rhotekin/Pak3, doubly tagged with 
mCherry (mCherry-RBD-mCherry) (Supplementary note). When 
mEGFP-Rho GTPase is activated, mCherry-RBD-mCherry binds to 
mEGFP-RhoA/Cdc42, causing FRET between mEGFP and mCherry 
(Supplementary Figs 1 and 2). These sensors were verified to be specific 
and sensitive under 2pFLIM (Supplementary note). 

Using these sensors, we measured the activity of RhoA and Cdc42 
during spine structural plasticity associated with long-term potentiation 
Figure 1 | Spatiotemporal dynamics of RhoA activation during long-term 
structural plasticity induced in single spines. a, Fluorescence lifetime images 
of RhoA activation during spine structural plasticity induced by two-photon 
glutamate uncaging. Arrowhead indicates the stimulated spine. Warmer 
colours indicate shorter lifetimes and higher RhoA activity. Scale bar, 5 µm. 
b, Time course of RhoA activation measured as a change in the fraction of 
mEGFP-RhoA bound to mCherry-RBD-mCherry in stimulated spines (stim), 
the dendritic shaft beside the stimulated spines (dend; within 1 µm), and 
adjacent spines (adj; 3-5 µm from the stimulated spines). Data using 
pharmacological inhibitors (Ctrl, control condition; KN62, CaMKII inhibitor; 
AP5, NMDA receptor inhibitor) are also shown. The inset to b shows a closer 
view of the first 4 min. The numbers of samples (spines/neurons) are 35/29 for 
stimulated spines and dendrites, 29/26 for adjacent spines, 16/10 for KN62 and 
8/5 for AP5. Error bars are s.e.m. c, Transient (averaged over 16-64 s) and 
sustained (averaged over 20-38 min) RhoA activation. Stars denote statistically 
significant difference (P < 0.05) from the value in the stimulated spines under the 
control condition. Wilcoxon signed-rank test was used for dendrites and 
adjacent spines, and analysis of variance (ANOVA) followed by post-hoc tests 
using the least significant difference was used for experiments with 
pharmacological inhibitors. d, Averaged time course of spine volume change in 
the same experiments as in b. The inset to d shows a closer view of the first 
4 min. e, Transient (volume change averaged over 1.5-2 min minus that 
over 20-38 min) and sustained volume change (volume change averaged over 
20-38 min). 
(Figs 1, 2 and 3). Pyramidal neurons in the CA1 region of cultured 
hippocampal slices were ballistically'* transfected with the RhoA or 
Cdc42 sensor, and the FRET signal was imaged under 2pFLIM. The 
spine volume was monitored using the red fluorescence of mCherry- 
RBD-mCherry (Supplementary Fig. 3)'*. To induce structural plasticity 
in a single dendritic spine, we applied a low-frequency train of two- 
photon glutamate uncaging pulses (30 pulses at 0.5 Hz) to the spine in 
zero extracellular Mg²⁺ (refs 13, 14 and 19). The spine volume increased 
rapidly by about 300% following glutamate uncaging (transient phase) 
and relaxed to an elevated level of 70-80% for more than 30 min (sus- 
tained phase) (Figs 1d and 2d)'*"*. The time course of spine enlarge- 
ment in neurons expressing the FRET sensor was similar to that in 
neurons expressing only EGFP (Fig. 4)"*, suggesting that the overexpression 
of FRET sensors has almost no effect on spine structural plasticity 
(Supplementary note). 

Under basal conditions, there was no correlation between the activity 
of Rho GTPases and spine volume (Supplementary Fig. 4). When spine 
structural plasticity was induced, both RhoA (Fig. 1a and b) and Cdc42 
(Fig. 2a and b) were activated rapidly within about 30 s in the stimulated 
spines. The activation decayed over about 5 min, followed by sustained 
activity lasting more than 30 min. RhoA activation spread into the 
dendrites over several micrometres (Figs 1a-c, and 3a and b), and 
invaded surrounding spines to a small extent (~25% of the stimulated 
spines). In contrast, Cdc42 activation was restricted to the stimulated 
spines (Figs 2a-c, and 3c and d). For both RhoA and Cdc42, the gradient 
at the spine necks was maintained for more than about 30 min 
(Figs 1b and c, and 2b and c). 

Next, we pharmacologically identified the signalling pathways that 
activate RhoA and Cdc42. Inhibition of N-methyl-D-aspartic acid 
(NMDA) receptors with 2-amino-5-phosphonopentanoic acid (AP5, 
50 µM) abolished activation of RhoA (Fig. 1b and c) and Cdc42 (Fig. 2b 


[Figure 2 graphic omitted: panels a-e. Legend: Ctrl (stim), Ctrl (dend), Ctrl (adj), KN62, AP5. Bar-plot titles: Peak, Sustained, Transient, Sustained. x axis: Time (min).] 


Figure 2 | Spatiotemporal dynamics of Cdc42 activation during long-term 
structural plasticity induced in single spines. The same experiments and 
analyses as in Fig. 1 but measuring Cdc42 activity instead of RhoA activity. The 
numbers of samples (spines/neurons) are 33/28 for stimulated spines and 
dendrite, 33/28 for adjacent spines, 11/6 for KN62 and 12/8 for AP5. 
and c) as well as spine enlargement (Figs 1d and e, and 2d and e)"*, 
indicating that RhoA and Cdc42 are activated by Ca²⁺ through NMDA 
receptors. The CaMKII inhibitor KN62 is known to strongly inhibit 
sustained spine enlargement, but has significantly less of an effect on 
transient spine enlargement (Figs 1d and e, and 2d and e)'*"*. KN62 
(10 µM) partially inhibited RhoA and Cdc42 activation during the 
transient phase, and more strongly during the sustained phase 
(Figs 1b and c, and 2b and c). Expression of autocamtide CaMKII 
inhibitor peptide 2 (AIP2) also inhibited spine volume change and 
Rho GTPase activation in a similar manner (Supplementary Fig. 5). 
These results suggest that RhoA and Cdc42 are downstream of 
CaMKII. 

We next characterized the spatial profile of RhoA and Cdc42 
activities along dendrites as a function of the distance from the stimu- 
lated spines (Fig. 3). RhoA activity showed a relatively small gradient 
between the stimulated spines and dendrites, and spread along the 
dendrites. The length constant of spread along the dendrite was 
4.5 µm (Fig. 3a and b), a value similar to that for HRas (about 
10 µm)". In contrast, Cdc42 activation was restricted to the stimulated 
spines (Fig. 3c and d), showing a spatial pattern similar to that of 
CaMKII*. A small fraction of Cdc42 activation spread into the 
Figure 3 | Spatial profile of RhoA and Cdc42 activities. a, Fluorescence 
lifetime images of RhoA activity before and after glutamate uncaging. 
Arrowheads in a and c indicate the stimulated spine. b, Averaged spatial profile 
of RhoA activation. Red circles indicate the activity in the stimulated spine, and 
black circles indicate the activity in the dendrite, plotted as a function of the 
distance along the dendrite from the stimulated spine. The number of samples 
(dendrites/neurons) is 20/18. c, Fluorescence lifetime images of Cdc42 activity. 
d, Averaged spatial profile of Cdc42 activation. The number of samples 
(dendrites/neurons) is 30/26. e, The fluorescence images of paGFP-RhoA (left) 
and paGFP-Cdc42 (right) after spine-head photoactivation (green, paGFP-Rho 
GTPases; red, tandem mCherry). f, Averaged time course of fluorescence 
decay in spines after photoactivation of paGFP-tagged proteins in the spines. 
The fluorescence intensity was normalized to the peak. The numbers of samples 
(spines/neurons) are 63/6 for paGFP, 38/4 for paGFP-HRas (G12V), 41/5 for 
paGFP-Cdc42 (WT), 83/9 for paGFP-Cdc42 (Q61L), 40/4 for paGFP-RhoA 
(WT) and 79/10 for paGFP-RhoA (Q63L). HRas (G12V), RhoA (Q63L) and 
Cdc42 (Q61L) are constitutively active mutants. g, h, Decay time constants 
(g) and the fraction remaining at 20 s (h) of paGFP fluorescence in the 
photoactivated spines. Horizontal bars indicate the means. 
dendrite and decayed sharply with a length constant of around 1.9 µm 
(Fig. 3c and d). These experiments were performed at room temper- 
ature (25-27 °C), but similar results were also obtained at near- 
physiological temperature (32-34 °C; Supplementary Figs 6 and 7). 
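
To make the two length constants concrete, the following minimal sketch assumes that activity falls off along the dendrite as a single exponential, exp(-x/L). That functional form is an assumption made here for illustration (it is the usual reading of "length constant" but is not spelled out in the text), and the function name is hypothetical.

```python
import math

def relative_activity(distance_um: float, length_constant_um: float) -> float:
    """Fraction of the activity at the spine base that remains at a given
    distance along the dendrite, assuming a single-exponential profile."""
    return math.exp(-distance_um / length_constant_um)

# Length constants quoted in the text: 4.5 um for RhoA, 1.9 um for Cdc42.
rhoa_at_5um = relative_activity(5.0, 4.5)
cdc42_at_5um = relative_activity(5.0, 1.9)

print(f"RhoA at 5 um:  {rhoa_at_5um:.3f}")
print(f"Cdc42 at 5 um: {cdc42_at_5um:.3f}")
```

Under this assumption, roughly a third of the RhoA signal would remain 5 µm from the spine, against less than a tenth of the Cdc42 signal.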
To test whether the difference in the degree of the compartmentali- 
zation of RhoA and Cdc42 is due to a difference in their mobility’*”*, we 
measured spine-dendrite diffusion coupling using photoactivatable 
GFP (paGFP)” fused to Rho GTPases. Following photoactivation of 
paGFP in a spine, the fluorescence intensity in the spine decayed owing 
to the diffusion of paGFP-Rho GTPases out of the spine with a time 
constant of about 3 s for the wild type and about 5 s for their constitu- 
tively active mutants (Fig. 3e—h). These values are about ten times larger 
than the decay time constant of cytosolic paGFP (~0.4 s) and similar to 
that of a constitutively active HRas mutant (~5 s) (Fig. 3e-h)’*. The 
difference between wild-type Rho GTPases and constitutively active 
mutants presumably reflects the difference in the fraction of the protein 
bound to the plasma membrane, given that active Rho GTPases are 
localized on the plasma membrane’. There was only a small fraction 
(10-20%) of fluorescence remaining at 20 s after photoactivation 
(Fig. 3e-h), suggesting that no major immobile fraction of RhoA or 
Cdc42 exists in spines. Thus, Cdc42 is as mobile as RhoA and HRas, yet 
only Cdc42 shows the compartmentalized activity. 

Next, to elucidate the roles of Rho GTPase activation in spine struc- 
tural plasticity, we measured the spine volume change under the 
inhibition of Rho or Cdc42 signalling (Fig. 4a-j). Downregulation of 
RhoA and RhoB with short-hairpin RNA (shRNA) decreased the 
transient volume change, but did not appreciably affect the sustained 
volume change (Fig. 4a, i and j). In contrast, shRNA against Cdc42 
decreased the sustained volume change, but not the transient volume 
change (Fig. 4e, i and j). The phenotypes caused by these shRNAs were 
rescued by co-expressing shRNA-resistant mEGFP-Rho GTPases 
(Fig. 4b, f, i and j), indicating that the effect of the shRNAs is specific 


[Figure 4 graphic omitted: panels a-l. Panels a-h plot spine volume change against Time (min) with the uncaging time marked; trace labels: Ctrl, sh-RhoA/B, sh-RhoA/B + mG-RhoA res, sh-Cdc42, sh-Cdc42 + mG-Cdc42 res, Wasp, IPA3. Bar-plot titles: Transient phase (i), Sustained phase (j). Model labels (k): NMDA receptor, CaMKII, RhoA, Rock, Cdc42, Pak, spine-specific signalling, transient volume change, sustained volume change. Trace labels (l): Cdc42, RhoA, CaMKII, Volume; x axis: Time (min).] 


Figure 4 | The effect of Rho GTPase inhibition on structural plasticity of 
spine head enlargement. a-h, Averaged time course of spine volume change in 
stimulated spines in neurons under manipulations of Rho GTPase signalling. 
Red traces: neurons were transfected with shRNAs against RhoA and RhoB 
(sh-RhoA/B) and mEGFP (a), sh-RhoA/B, mEGFP-shRNA resistant RhoA 
(mEGFP-RhoA res) and tandem mCherry (b), mCherry-C3 transferase (C3) 
and mEGFP (c), mEGFP (d, h), shRNA against Cdc42 (sh-Cdc42) and mEGFP 
(e), sh-Cdc42, mEGFP-shRNA resistant Cdc42 (mEGFP-Cdc42 res) and 
tandem mCherry (f) or mCherry-Wasp(210-321) (Wasp) and mEGFP 
(g). Black traces: paired control experiments were performed in the same batch 
of slices using a scrambled shRNA instead of targeted shRNAs (a, e), mEGFP 
alone (b, f) or mCherry instead of C3 transferase and Wasp 
(c, g). Pharmacology experiments (d, h) were performed before (paired control, 
black) and 30-40 min (red) after applying drugs to the bath. Fluorescence 
intensity of mEGFP (a, c, d, e, g, h) or tandem mCherry (b, f) was used to 
measure the spine volume change. The numbers of samples (spines/neurons) 
are indicated in the figures (same numbers for control and experiment groups). 
i, Transient volume change (volume change averaged over 1.5-2 min 
minus that averaged over 20-36 min). Stars denote statistical 
significance (P < 0.05, paired t-test). j, Sustained volume change (volume 
change averaged over 20-36 min). k, A model of Cdc42 and RhoA 
activation. l, Superimposed time courses of spine volume change and activation 
of RhoA (Fig. 1b), Cdc42 (Fig. 2b) and CaMKII’ in spines undergoing 
structural plasticity. The time courses were normalized to the peak. The 
right-hand panel shows a closer view of the first 4 min. 
and the mEGFP-RhoA and mEGFP-Cdc42 used in the FRET sensors 
are functional as endogenous proteins. Because downregulation of 
proteins with shRNA is partial (Supplementary Fig. 8) and requires 
a relatively long time (4 days), we also inhibited Rho and Cdc42 sig- 
nalling by expressing mCherry-C3 transferase, a Rho inhibitor’’, and 
the Cdc42 binding domain of Wasp (221-321) tagged with mCherry 
(Wasp), respectively, for a shorter time (24 h). Rho inhibition with C3 
transferase inhibited both the transient and the sustained phases 
(Fig. 4c, i and j), showing stronger effects than shRNA (Fig. 4a, i and 
j). Cdc42 inhibition with Wasp inhibited the sustained phase but not 
the transient phase (Fig. 4g, i and j), consistent with the shRNA result 
(Fig. 4e, i and j). Thus, our data suggest that Rho signalling is required 
for the transient phase and probably the sustained phase of spine 
enlargement, whereas Cdc42 signalling is required for the sustained 
phase. Neither C3 transferase nor Wasp affected CaMKII activation 
(Supplementary Fig. 9), indicating that there is no feedback signalling 
from Rho and Cdc42 to upstream Ca²⁺ and CaMKII. C3 transferase 
and Wasp also inhibited synaptic potentiation induced by pairing 
postsynaptic depolarization (0 mV) and two-photon glutamate 
uncaging (Supplementary Fig. 10)'*"*, suggesting that Rho and 
Cdc42 are important for the functional plasticity as well as the struc- 
tural plasticity of spines. 

Among known effectors of Rho and Cdc42, Rock and Pak are two 
kinases that can be activated respectively by these GTPases**°. We 
tested whether they are required for structural plasticity through acute 
(30-40 min) application of specific pharmacological inhibitors. 
Inhibition of Rock with Glycyl-H1152 (2 µM)” suppressed both transient 
and sustained volume change (Fig. 4d, i and j), similarly to the Rho 
inhibitor C3 transferase (Fig. 4c). In contrast, inhibition of Pak with 
inhibitor targeting Pak1 activation-3 (IPA3) (100 µM)’* decreased the 
sustained volume change selectively without changing the transient 
volume change (Fig. 4h, i and j), similarly to inhibition of Cdc42 signalling 
(Fig. 4e and g). Taken together with the results from Rho/Cdc42 
inhibition (Fig. 4a-j), our data imply that the Rho-Rock pathway is 
required for both the transient and sustained phases, whereas the 
Cdc42-Pak pathway is required for the sustained phase of the structural 
plasticity but not for the transient phase (Fig. 4k). 

In this study, we visualized RhoA and Cdc42 activation in single 
dendritic spines undergoing structural plasticity associated with long- 
term potentiation’ '>”’. The time course of their activation was similar 
to that of the volume change: rapid activation was followed by persistent 
activation lasting more than 30 min (Fig. 4l). As expected from its 
high mobility (Fig. 3), RhoA spread into the dendrite upon activation 
(Fig. 1a-c)’*. However, the activity invasion into adjacent spines was 
relatively small (25% of the stimulated spines, Fig. 1b and c) and was 
not sufficient to produce plasticity (Fig. 1d and e). In contrast with the 
diffusive pattern of RhoA activity, Cdc42 activity was restricted to the 
stimulated spines (Figs 2 and 3). The compartmentalization of Cdc42 
activity is not due to the limited diffusion of active Cdc42, because 
active Cdc42 is as mobile as RhoA and HRas (Fig. 3e-h). Given that the 
high spatial gradient of Cdc42 between the stimulated spines and 
dendrite was maintained for more than 30 min (Fig. 2b and c, and 
3d), Cdc42 must be continuously activated at the stimulated spines 
during plasticity, and inactivated immediately after diffusing out of the 
spines. The short length constant of Cdc42 in the dendrites (1.9 µm, 
Fig. 3) also supports the fast inactivation of Cdc42 in the dendrite. The 
inactivation time constant τ is related to the length constant L and the 
diffusion constant D ≈ 0.6 µm² s⁻¹ (ref. 12) as follows: τ = L²/D, and so τ 
is about 6 s for Cdc42 and about 30 s for RhoA, compared to 200-300 s 
for HRas”. 
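
The estimate above can be checked by evaluating τ = L²/D directly. This is a minimal sketch that assumes D is expressed in µm² s⁻¹ and uses the length constants quoted in the text (1.9 µm for Cdc42 and 4.5 µm for RhoA); the function name is illustrative only.

```python
def inactivation_time_constant(length_um: float, diffusion_um2_per_s: float) -> float:
    """Inactivation time constant tau = L^2 / D, in seconds."""
    return length_um ** 2 / diffusion_um2_per_s

D = 0.6  # um^2 per second, the diffusion constant quoted in the text (units assumed)

tau_cdc42 = inactivation_time_constant(1.9, D)  # length constant of Cdc42
tau_rhoa = inactivation_time_constant(4.5, D)   # length constant of RhoA

print(f"Cdc42: {tau_cdc42:.1f} s")
print(f"RhoA:  {tau_rhoa:.1f} s")
```

The two values come out at roughly 6 s and 34 s, in line with the rounded figures given for Cdc42 and RhoA.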

Our results further indicated that RhoA and Cdc42 activation is 
CaMKII-dependent, and activation lasts for more than 30 min (Figs 1 
and 2). The previous imaging study suggested that CaMKII activity 
decays with a time constant of about 10 s (Fig. 4k and l)’, thereby 
integrating NMDA-receptor-evoked Ca²⁺ transients that last for about 
0.1 s (refs 13 and 29). Localized, persistent activation of RhoA and 
Cdc42, which peaks between CaMKII activation and the volume change 
(Fig. 4l), relays the transient CaMKII signalling’’ to long-term spine 
structural plasticity (Fig. 4k). In particular, because both CaMKII'** 
and Cdc42 exhibit spine-specific activation and are required for the 
maintenance of plasticity (Fig. 4e, g, i and j), the NMDA receptor- 
CaMKII-Cdc42-Pak pathway (red line in Fig. 4k) constitutes the 
spine-specific signal transduction spanning the timescale from less than 
one second to more than half an hour to cause sustained structural and 
functional spine plasticity (Fig. 4k and l). 


METHODS SUMMARY 


Hippocampal cultured slices were prepared from postnatal day 6-7 rats as 
described**. Neurons were sparsely transfected with Rho GTPase FRET sensors 
using ballistic gene transfer'® at days in vitro 10-14, and imaged 2-4 days after 
transfection. Rho-GTPase activity was measured using 2pFLIM (green) and spine 
volume change was monitored by measuring the fluorescence intensity of 
mCherry-RBD-mCherry (red) in spines using normal two-photon microscopy 
(Supplementary Fig. 3)'*'*. Most of the imaging experiments were performed at 
room temperature (25-27 °C). 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Plasmids. Plasmids containing Dbl/p50RhoGAP, Rhotekin(8-89), C3 transferase, 
RhoA/Cdc42/Pak1(65-118), and Wasp were the kind gifts of K. Hahn, G. 
Bokoch, A. Aplin, M. Matsuda and S. Soderling, respectively. Pak3(60-113) was 
prepared by introducing mutations M99I and S115L into Pak1(65-118). mEGFP-RhoA 
and mEGFP-Cdc42 were prepared by inserting human RhoA and Cdc42 
coding sequences into the pEGFP-C1 vector (Clontech) containing the A206K 
monomeric mutation in EGFP*!. Because the transcription of human RhoA is 
terminated at the codon 349-353 (AATAA) in Escherichia coli, we introduced a 
silent mutation from AATAA to AACAA at the site, which does not change the 
amino acid sequence. This sequence was used for all experiments. The linker 
between mEGFP and Rho GTPases (RhoA and Cdc42) is SGLRSRG. Tandem 
mCherry”? was prepared by replacing the EGFP of the pEGFP-N3 vector with two 
mCherry. mCherry-Rhotekin(8-89)-mCherry and mCherry-Pak3(60-113)-mCherry 
were prepared by inserting Rhotekin(8-89) and Pak3(60-113) into tandem 
mCherry, respectively. The linkers between Rhotekin(8-89) and mCherry are 
SGLRSG for the amino terminus and VDVTAGPGSG for the carboxy terminus. 
The linkers between Pak3(60-113) and mCherry are SGLRSRG for the N terminus 
and GSG for the C terminus. Photoactivatable GFP (paGFP)*’-Rho GTPases were 
prepared by replacing the mEGFP of the mEGFP-Rho GTPases with paGFP 
(A206K). Mutations were introduced using a Site-Directed Mutagenesis kit 
(Stratagene). shRNAs were prepared using the pSuper vector (Oligoengine) with 
kanamycin resistance gene. The following target sequences were used for shRNA 
(5'-3'): GGGCAAGAGGATTATGACA for Cdc42 (rat and human), GAAGG 
ATCTTCGGAATGAT for RhoA (mouse, rat and human), CATCTTGGTGG 
CCAACAAA for RhoB (rat) and GTGTTGAAGTATCTGTACG for control. 
shRNA-resistant RhoA and Cdc42 were prepared by introducing three silent 
mutations in the targeted sequences. mCherry-C3 transferase (34-end) and 
mCherry-Wasp(210-321) were prepared in the pEGFP-C1 vector without 
EGFP. 

Proteins. Polyhistidine-tagged mEGFP, mCherry, super-folder GFP (sfGFP)- 
Rho GTPases*’, mCherry-RBDs and their mutants were cloned into the pRSET 
bacterial expression vector (Invitrogen). Proteins were overexpressed in E. coli 
(DH5α), purified with a Ni²⁺-nitrilotriacetate column (HiTrap, GE Healthcare), 
and desalted with a desalting column (PD10, GE Healthcare) equilibrated with 
phosphate buffered saline. The concentration of the purified protein was measured 
by the absorbance of the fluorophore (mEGFP, A489 nm = 56,000 cm⁻¹ M⁻¹ (ref. 13); 
sfGFP, A489 nm = 83,000 cm⁻¹ M⁻¹ (ref. 33); mCherry, A587 nm = 72,000 cm⁻¹ M⁻¹ 
(ref. 32)) or Bradford assay. 

Preparation. Hippocampal slices were prepared from postnatal day 6-7 rats, as 
described*®, in accordance with the animal care and use guidelines of Duke 
University Medical Center. After 1 to 2 weeks in culture, CA1 pyramidal neurons 
were transfected with ballistic gene transfer* using gold beads (8-12 mg) coated with 
plasmids containing 30 µg of total complementary DNA (donor:acceptor = 1:1), 
and imaged 2-4 days after transfection. 

HeLa and Rat1 cells were cultured in Dulbecco’s modified Eagle medium supplemented 
with 10% fetal calf serum at 37 °C in 5% CO₂ and transfected using 
Lipofectamine (Invitrogen). 

Two-photon fluorescence lifetime imaging. Details of FRET imaging using a 
custom-built two-photon fluorescence lifetime imaging microscope have been 
described previously*°’®. mEGFP and mCherry were simultaneously excited with 
a Ti:sapphire laser (Maitai, Spectraphysics) tuned at a wavelength of 920 nm. The 
fluorescence was collected by an objective (60×, numerical aperture 0.9, 
Olympus), divided with a dichroic mirror (565 nm) and detected with two sepa- 
rated photoelectron multiplier tubes placed after wavelength filters (Chroma, 
HQ510/70-2p for green and HQ620/90-2p for red). For fluorescence lifetime 
imaging in the green channel, a photoelectron multiplier tube with low transfer 
time spread (H7422-40p; Hamamatsu) was used. A wide-aperture photoelectron 
multiplier tube (R3896; Hamamatsu) was used for the red channel. Fluorescence 
lifetime images were obtained using a time-correlated single photon counting 
board (SPC-140; Becker and Hickl) controlled with custom software'!. The red 
signal was acquired using a separate data acquisition board (PCI-6110) and 
Scanimage software*’. 

Two-photon glutamate uncaging. A second Ti:sapphire laser tuned at a wavelength 
of 720 nm was used to uncage 4-methoxy-7-nitroindolinyl-caged-L-glutamate 
(MNI-caged glutamate) in extracellular solution with a train of 4-6-ms, 
8-mW pulses (30 times at 0.5 Hz) near a spine of interest. Experiments were performed 
in Mg²⁺-free artificial cerebrospinal fluid (127 mM NaCl, 2.5 mM KCl, 
4 mM CaCl₂, 25 mM NaHCO₃, 1.25 mM NaH₂PO₄ and 25 mM glucose) containing 
1 µM tetrodotoxin and 2 mM MNI-caged L-glutamate aerated with 95% O₂ and 5% 
CO₂, at 25-27 °C, as described previously’. In Supplementary Figs 6 and 7, neurons 
were maintained at 32-34 °C using a temperature controller (TC324B, SW-10/6 and 
SH-27B, Warner Instruments). 


Two-photon photoactivation of photoactivatable GFP. Two-photon images of 
paGFP tagged proteins were acquired every 64 ms using a Ti:sapphire laser tuned 
at a wavelength of 940 nm. For photoactivation of paGFP, the uncaging laser tuned 
at a wavelength of 800 m was used to apply a pulse of 8 mW with 6-10 ms duration 
at a spine head. To determine the decay time constant t and the immobile fraction 
fim» the paGFP fluorescence F was fitted with an exponential function, 
F(t) = Foexp(—t/t) + fin» where Fo is the fluorescence intensity at t= 0. 
2pFLIM data analyses. To obtain the mEGFP fluorescence lifetime, we summed 
over all pixels in an image of a cell expressing mEGFP-Rho GTPases, and fitted a 
fluorescence lifetime curve with a single exponential function convolved with the 
Gaussian pulse response function: 


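$$F(t) = F_0\,H(t, t_0, \tau_D, \tau_G) \tag{1}$$


where $F_0$ is a constant, and


$$H(t, t_0, \tau_D, \tau_G) = \frac{1}{2}\exp\!\left(\frac{\tau_G^2}{2\tau_D^2} - \frac{t - t_0}{\tau_D}\right)\operatorname{erfc}\!\left(\frac{\tau_G^2 - \tau_D (t - t_0)}{\sqrt{2}\,\tau_D \tau_G}\right) \tag{2}$$


in which $\tau_D$ is the fluorescence lifetime of the free donor (mEGFP-Rho GTPase),
$\tau_G$ is the width of the Gaussian pulse response function, $F_0$ is the peak
fluorescence before convolution, $t_0$ is the time offset and erfc is the
complementary error function. To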
measure the fraction of donor bound to acceptor, we summed all pixels over a 
whole image and fitted a fluorescence lifetime curve with a double exponential 
function convolved with the Gaussian pulse response function: 


$$F(t) = F_0\left[P_D\,H(t, t_0, \tau_D, \tau_G) + P_{AD}\,H(t, t_0, \tau_{AD}, \tau_G)\right] \tag{3}$$


where $\tau_{AD}$ is the fluorescence lifetime of donor bound with acceptor, and
$P_D$ and $P_{AD}$ are the fractions of free donor and donor bound with acceptor,
respectively. We fixed $\tau_D$ to the fluorescence lifetime obtained from free
mEGFP-Rho GTPase (2.59 ns). To generate the fluorescence lifetime image, we
calculated the mean photon arrival time, $\langle t \rangle$, in each pixel as:


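$$\langle t \rangle = \int t\,F(t)\,dt \Big/ \int F(t)\,dt$$


The mean photon arrival time is then related to the mean fluorescence lifetime
$\langle \tau \rangle$ by an offset arrival time $t_0$, which is obtained by
fitting the whole image:


$$\langle \tau \rangle = \langle t \rangle - t_0$$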


For small regions of interest in an image (spines or dendrites), we calculated the
binding fraction ($P_{AD}$) as:


$$P_{AD} = \tau_D\,(\tau_D - \langle\tau\rangle)\,(\tau_D - \tau_{AD})^{-1}\,(\tau_D + \tau_{AD} - \langle\tau\rangle)^{-1} \tag{4}$$
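
As an illustration of equation (4), the following minimal sketch computes the binding fraction from a mean fluorescence lifetime and checks it against the forward relation for a two-component decay. The forward formula assumes an unconvolved double exponential whose amplitudes are the fractions of free and bound donor; the function names and the bound-donor lifetime of 1.1 ns are hypothetical and are not taken from the measurements described here. Only the free-donor lifetime of 2.59 ns comes from the text.

```python
TAU_D_NS = 2.59   # free-donor lifetime reported in the text (ns)
TAU_AD_NS = 1.1   # hypothetical bound-donor lifetime, chosen for illustration only


def mean_lifetime(p_ad, tau_d, tau_ad):
    """Photon-weighted mean lifetime of a mixture of free and bound donor.

    Assumes F(t) = P_D * exp(-t / tau_d) + P_AD * exp(-t / tau_ad), so that
    <tau> = integral(t * F) / integral(F).
    """
    p_d = 1.0 - p_ad
    numerator = p_d * tau_d ** 2 + p_ad * tau_ad ** 2
    denominator = p_d * tau_d + p_ad * tau_ad
    return numerator / denominator


def binding_fraction(mean_tau, tau_d, tau_ad):
    """Equation (4): fraction of donor bound to acceptor from the mean lifetime."""
    return tau_d * (tau_d - mean_tau) / ((tau_d - tau_ad) * (tau_d + tau_ad - mean_tau))


if __name__ == "__main__":
    for true_fraction in (0.0, 0.1, 0.3, 0.6):
        tau = mean_lifetime(true_fraction, TAU_D_NS, TAU_AD_NS)
        recovered = binding_fraction(tau, TAU_D_NS, TAU_AD_NS)
        print(f"P_AD = {true_fraction:.2f}, <tau> = {tau:.3f} ns, recovered = {recovered:.3f}")
```

Under these assumptions the inversion is exact, so the recovered fraction matches the input fraction for any mixture of the two lifetimes.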


Overexpression level of Rho GTPase sensors in neurons. The concentration of 
mEGFP-Rho GTPase and mCherry-RBD-mCherry in neurons was estimated by 
measuring fluorescence intensity of mEGFP and mCherry in thick apical dendrites 
under two-photon microscopy relative to that of purified, polyhistidine-tagged 
mEGFP (1 µM) and mCherry (10 µM), respectively. 

Measurements of the affinity between Rho GTPases and RBDs. Purified
sfGFP-Rho GTPases (RhoA and Cdc42) were loaded with GppNHp
(2′,3′-O-N-methyl anthraniloyl-GppNHp) and GDP by incubating in the presence of
tenfold molar excess of GppNHp and GDP in MgCl₂-free phosphate buffered
saline containing 1 mM EDTA for 10 min, respectively. The reaction was terminated
by adding 10 mM MgCl₂ (ref. 40). sfGFP-Rho GTPases and mCherry-RBD
were mixed and incubated at room temperature for 20 min. FRET between sfGFP
and mCherry was measured under 2pFLIM, and the fraction of sfGFP-Rho
GTPases bound to mCherry-RBD was calculated by fitting the fluorescence lifetime
curve with a double exponential function (equation (3)). The dissociation
constant was obtained by fitting the relationship between the binding fraction and
the concentration of mCherry-RBD with a Michaelis-Menten function
(Supplementary Fig. 1).
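
A fit of this kind can be sketched as follows. The example assumes the Michaelis-Menten form fraction = B_max·c/(K_d + c) and estimates the parameters by a Hanes-Woolf linearization (regressing c/fraction on c), which is a simple stand-in and not necessarily the fitting procedure used for Supplementary Fig. 1. The function name and the synthetic data are hypothetical.

```python
def fit_binding_curve(concentrations, fractions):
    """Estimate (k_d, b_max) for fraction = b_max * c / (k_d + c).

    Uses the Hanes-Woolf rearrangement c / fraction = c / b_max + k_d / b_max,
    then ordinary least squares for the straight line.
    """
    xs = list(concentrations)
    ys = [c / f for c, f in zip(concentrations, fractions)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / b_max
    intercept = mean_y - slope * mean_x    # equals k_d / b_max
    b_max = 1.0 / slope
    k_d = intercept * b_max
    return k_d, b_max


if __name__ == "__main__":
    # Synthetic, noise-free data generated with k_d = 2.0 uM and b_max = 0.8
    # (illustrative values only, not measurements).
    conc_um = [0.5, 1.0, 2.0, 5.0, 10.0]
    frac = [0.8 * c / (2.0 + c) for c in conc_um]
    k_d, b_max = fit_binding_curve(conc_um, frac)
    print(f"k_d = {k_d:.3f} uM, b_max = {b_max:.3f}")
```

With noisy data a nonlinear least-squares fit would usually be preferred, because the linearization weights low-concentration points unevenly.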

Estimation of endogenous Rho GTPase concentration. We determined the
concentrations of CaMKIIα, RhoA and Cdc42 in the CA1 region of hippocampal
slice culture by semiquantitative western blotting (Supplementary Fig. 15). First,
the CA1 regions from ten slices were collected and weighed. The series of purified
CaMKIIα (4 µM, 10 µM, 20 µM, 30 µM, 40 µM), RhoA (0.1 µM, 0.2 µM, 0.5 µM,
1.0 µM, 1.5 µM) or Cdc42 (0.1 µM, 0.2 µM, 0.5 µM, 1.0 µM, 1.5 µM) were prepared
to the same weights as the CA1 tissue. The CA1 tissue and purified proteins
were dissolved in SDS sample buffer, and analysed by western blotting. The
following antibodies were used: anti-CaMKII (EP1829Y; Abcam); anti-RhoA (26C4;
Santa Cruz Biotechnology); anti-Cdc42 (BD44; BD Transduction Laboratories);
goat anti-mouse (Zymax). Chemiluminescence signals were detected using a
Storm image acquisition system (Molecular Dynamics), and analysed digitally
using ImageJ software. By comparing the band intensities of the purified proteins
to that of lysates from the CA1 tissue, we calculated the concentration of the
respective proteins in the CA1 tissue (Supplementary Fig. 6). The concentration
was estimated to be 23.4 µM for CaMKIIα, 0.38 µM for RhoA and 0.25 µM for
Cdc42. The estimation of CaMKIIα concentration is consistent with that obtained
from immunofluorescence.
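
The comparison of band intensities with a dilution series amounts to reading a concentration off a standard curve. The sketch below assumes a linear relationship between band intensity and concentration over the range of the standards, which may not hold for real chemiluminescence data. The RhoA standard concentrations are those listed above; the band intensities are invented purely for illustration, and the function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_band(intensity, standards_um, standard_intensities):
    """Invert the standard curve to estimate the concentration of a lysate band."""
    slope, intercept = linear_fit(standards_um, standard_intensities)
    return (intensity - intercept) / slope


if __name__ == "__main__":
    rhoa_standards_um = [0.1, 0.2, 0.5, 1.0, 1.5]                 # from the text
    made_up_intensities = [150.0, 250.0, 550.0, 1050.0, 1550.0]   # hypothetical values
    lysate_intensity = 430.0                                      # hypothetical value
    estimate = concentration_from_band(
        lysate_intensity, rhoa_standards_um, made_up_intensities
    )
    print(f"Estimated concentration: {estimate:.2f} uM")
```

The estimate is only meaningful when the lysate band falls within the intensity range spanned by the standards.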

Electrophysiology. Whole-cell patch clamping was performed with patch pipettes
(4-9 MΩ) containing Cs⁺ internal solution (130 mM CsMeSO₃, 10 mM
Na-phosphocreatine, 4 mM MgCl₂, 4 mM Na-ATP, 0.4 mM Na₂-GTP, 10 mM
Cs-HEPES [pH 7.3]). Excitatory postsynaptic current evoked by two-photon
glutamate uncaging at a spine was measured through the patch pipette using a
patch-clamp amplifier (Multiclamp 700B, Molecular Devices). Long-term
potentiation was induced within 5 min of patching by pairing depolarization (0 mV)
with two-photon glutamate uncaging at a spine (0.5 Hz, 4 ms, 30 pulses). Neurons
showing more than 20% drift in the input or series resistances were not used for
further analyses.
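
The exclusion rule in the last sentence can be written as a simple filter. This is a minimal sketch that assumes drift is measured as the relative change between the first and last resistance readings of a recording; the text does not specify how drift was computed, and the function names are hypothetical.

```python
MAX_DRIFT = 0.20  # 20% threshold stated in the text


def relative_drift(initial, final):
    """Absolute change relative to the initial value (assumed definition of drift)."""
    return abs(final - initial) / initial


def keep_recording(input_r, series_r, max_drift=MAX_DRIFT):
    """Return True if neither resistance drifted by more than max_drift.

    input_r and series_r are (initial, final) pairs in the same units.
    """
    return (
        relative_drift(*input_r) <= max_drift
        and relative_drift(*series_r) <= max_drift
    )


if __name__ == "__main__":
    print(keep_recording(input_r=(200.0, 215.0), series_r=(15.0, 16.0)))  # small drift
    print(keep_recording(input_r=(200.0, 260.0), series_r=(15.0, 16.0)))  # 30% input drift
```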

Crystal structure of inhibitor of κB kinase β


Guozhou Xu, Yu-Chih Lo, Qiubai Li, Gennaro Napolitano, Xuefeng Wu, Xuliang Jiang, Michel Dreano, Michael Karin
& Hao Wu


Inhibitor of κB (IκB) kinase (IKK) phosphorylates IκB proteins, leading to their degradation and the liberation of nuclear
factor κB for gene transcription. Here we report the crystal structure of IKKβ in complex with an inhibitor, at a resolution
of 3.6 Å. The structure reveals a trimodular architecture comprising the kinase domain, a ubiquitin-like domain (ULD)
and an elongated, α-helical scaffold/dimerization domain (SDD). Unexpectedly, the predicted leucine zipper and
helix-loop-helix motifs do not form these structures but are part of the SDD. The ULD and SDD mediate a critical interaction
with IκBα that restricts substrate specificity, and the ULD is also required for catalytic activity. The SDD mediates IKKβ
dimerization, but dimerization per se is not important for maintaining IKKβ activity and instead is required for IKKβ
activation. Other IKK family members, IKKα, TBK1 and IKK-i, may have a similar trimodular architecture and function.


Nuclear factor κB (NF-κB) transcription factors are master regulators
of inflammatory, immune and apoptotic responses. In the canonical
pathway, NF-κB dimers are held in the cytoplasm through binding to
IκB proteins, which mask their nuclear localization signals. When cells
are stimulated by NF-κB inducers, IκB proteins are phosphorylated by
the Ser/Thr-specific IKK, a modification that triggers their Lys-48-linked
polyubiquitination and subsequent proteasomal degradation.
Freed NF-κB dimers enter the nucleus to regulate gene transcription.
In the non-canonical pathway, activated IKK phosphorylates the IκB-like
domain in the NF-κB family member NF-κB2 (p49/p100)
(NFKB2 in humans). NF-κB signalling pathways are associated with
a vast number of human diseases including inflammatory disorders
and cancer, which renders IKK a potentially important therapeutic
target (see, for example, http://www.nf-kb.org).

IKK was originally purified from HeLa cells as a multiprotein complex
that contains the kinase subunits IKKα (CHUK) and/or IKKβ
(IKBKB), and the regulatory protein NEMO (IKKγ, or IKBKG).
IKKα and IKKβ both contain an amino-terminal kinase domain
(KD), predicted leucine zipper (LZ) and helix-loop-helix (HLH)
domains, and a carboxy-terminal NEMO-binding domain (Fig. 1a).
IKKβ seems to have an additional ULD following the KD, which is not
predicted in the corresponding region of IKKα. The IKK-related
kinases TBK1 (NAK) and IKK-i (IKK-ε, or IKBKE) seem to have a
similar domain organization. Whereas IKKβ mediates activation of
the canonical NF-κB pathway in response to pro-inflammatory stimuli,
IKKα has an indispensable role in non-canonical NF-κB signalling
by phosphorylating NF-κB2 (p49/p100). Protein kinase assays suggest
that IKKβ accounts for nearly all of the catalytic activity of the
IKK holoenzyme towards IκBα (NFKBIA).

The activation loop in both the IKKα and the IKKβ KD contains
the mitogen-activated protein (MAP)-kinase kinase consensus motif
SXXXS, where X is any amino acid (Ser 177 and Ser 181 in human
IKKβ). Some MAP-kinase kinase kinase family members, such as
TGF-β-activated kinase 1 (MAP3K7, previously TAK1) and
NF-κB-inducing kinase (MAP3K14, or NIK), were shown to phosphorylate
IKKs. IKKα and IKKβ can also undergo autophosphorylation
and activation as a result of overexpression or signal-dependent
NEMO clustering. Ala substitutions of the activation-loop Ser
residues prevent IKK activation whereas the phosphomimetic, double
Glu mutation S177E/S181E (EE) of IKKβ renders it constitutively
active.


Trimodular arrangement of IKKβ 

We determined the crystal structure of Xenopus laevis IKKβ (ikbkb)
EE (residues 4-675; Fig. 1a) in complex with either inhibitor
Cmpd1 or inhibitor Cmpd2 (Supplementary Fig. 1) at resolutions
of 4.0 and 3.6 Å in the I4₁22 and P1 space groups, respectively
(Supplementary Tables 1 and 2 and Supplementary Fig. 2). Eight IKKβ
molecules in P1 and the single molecule in I4₁22 are highly similar to
each other (Supplementary Fig. 3 and Supplementary Table 3) and
show conserved dimerization (see below). Our structural description
will use the first dimer (chains A and B) in P1. The sequences of
human and Xenopus IKKβ (henceforth hIKKβ and xIKKβ, respectively)
have 74% identity with no gaps within the region of the
structure; residue numbers designated for xIKKβ are also true for
hIKKβ.

The IKKβ dimer structure resembles a pair of shears and has the
overall dimensions of approximately 100 Å × 130 Å × 60 Å (Fig. 1b,
c). It contains the KD (residues 16-307), the ULD (310-394) and the
SDD (410-666) (Fig. 1a and Supplementary Fig. 4). The KD and the
ULD form the ‘handle’ of the shears, and the SDD is the ‘blade’. Both
the KD and the ULD intimately associate with the SDD, suggesting
that the KD does not function independently. Instead, the structure
indicates that IKKβ is an integral trimodular unit.

The IKKβ KD in complex with either Cmpd1 or Cmpd2 has the
typical bilobal kinase fold. Although it has only 21.1% sequence
identity with human ubiquitin, the ULD of IKKβ indeed has the
ubiquitin fold (Fig. 1d). A major difference is the presence of an
eight-residue insertion (373-380) that forms part of the loop connecting
β-strands β4′ and β5′ in the ULD, which is mostly disordered.
The hydrophobic surface patch of ubiquitin, which is often recognized
by ubiquitin-binding proteins, is conserved in the ULD, with
residues Val 318, Leu 354 and Leu 389 equivalent to Leu 8, Ile 44 and
Val 70 of ubiquitin (Supplementary Fig. 5).

The SDD comprises six α-helices (α1s-α6s), of which α2s and α6s
contain 70 and 77 residues, respectively. They twist together with a
stretch of three shorter helices between them to fold as an elongated
structural ensemble. The C lobe of the KD sits on the N-terminal end
of the SDD, and the ULD binds close to the middle of the SDD.
Unexpectedly, formerly designated LZ (458-485) and HLH domains
(605-644) do not exist as such in the structure and are both part of the
SDD (Fig. 1e). Most of the hydrophobic residues in the predicted LZ
point inwards and are not available for mediating dimerization as
previously proposed.


Structure of inhibitor-bound IKKβ KD 

The inhibitor binds to the IKKβ KD at the hinge loop connecting the N
and C lobes, a region that recognizes the adenine in ATP and is often
used for inhibitor binding (Fig. 2a and Supplementary Fig. 6). The
KD conformation is incompatible with that of an active kinase
(Fig. 2b, c). The activation segment, which begins from Asp 166-Leu
167-Gly 168 as the conserved DFG triad and extends to Ala 190-Pro
191-Glu 192 in the conserved APE motif, is fully ordered, including
phosphomimetic residues Glu 177 and Glu 181 (Fig. 2b). However, the
C-terminal anchor of the activation segment, including the APE motif,
is completely out of place, with the result that essential interactions with
the catalytic centre are lost (Fig. 2c). The gross conformation of the
N-terminal anchor of the activation segment is preserved, with β9
paired with the β6 strand that precedes the catalytic loop. Part of the
activation loop (175-177) contains an additional β-strand (β10), which
sits between the lobes to form a three-stranded β-sheet with β9 and β6.
The αC helix is tilted up and outwards (Fig. 2c) to make room for β10,
which also disrupts its productive interactions with the DFG motif in
active kinase structures.


Figure 1 | Structure of xIKKβ. a, Linear representation of IKKβ showing the
boundaries for the KD, the ULD and the SDD. Sequences of hIKKβ and xIKKβ are
of 756 and 758 residues, respectively, and differ only at the most C-terminal
region. The crystallized xIKKβ construct is shown. The previously designated LZ
and HLH regions are shown in parentheses. b, Ribbon diagram of an xIKKβ
protomer in the P1 crystal form. The N and C termini, KD N lobe (orange) and
C lobe (yellow), ULD (magenta) and SDD (blue) are labelled. Secondary structural
elements are labelled, with those in the ULD indicated with a prime and those in
the SDD indicated with an ‘s’. c, Ribbon diagram of an xIKKβ dimer.
d, Superposition of the ULD (magenta) with ubiquitin (grey). e, Ribbon diagram
of an SDD dimer, showing locations of the previously designated LZ (red) and
HLH (orange) regions.


Interactions among the KD, ULD and SDD 


The KD, ULD and SDD interact mutually, with the ULD-SDD interaction
being the most extensive, followed by the KD-SDD and KD-ULD
interactions. The ULD binds close to the middle of the SDD, contacting
helices α2s and α6s (Fig. 3a). The interaction buries ~700 Å² of surface
area per partner. There are substantial hydrophobic contributions,
supplemented by electrostatic interactions. Residues Met 315,
Met 317, Leu 354, Ile 387, Leu 389 and Phe 390 on one side of the
ULD pack against Leu 612, Tyr 609, Leu 447 and the main chain of
α2s of the SDD to form the central hydrophobic core of the interface.
This hydrophobic patch of the ULD is immediately adjacent to and
overlaps the conserved hydrophobic patch in the ubiquitin fold.
Electrostatic interactions are observed between Glu 352 and Lys 619
and between Lys 394 and Asp 610. Additional interfacial residues
include Ser 319 and Ser 357 of the ULD and Thr 453, Gln 449 and
Val 616 of the SDD.


Figure 2 | Inhibitor-bound xIKKβ kinase domain. a, Fo − Fc electron
density map for Cmpd1 in the I4₁22 structure, contoured at 2.0σ. Carbon,
nitrogen and oxygen atoms are shown in green, blue and red, respectively. The
four ring structures in Cmpd1 are labelled A, B, C and D, respectively.
b, Structure of the xIKKβ KD. Gly-rich loop, cyan; activation segment, red
(except that the DLG and APE sequences are in black); Cmpd1, purple. Side
chains of phosphomimetic residues Glu 177 and Glu 181 are shown.
c, Superposition of xIKKβ (orange and yellow) and protein kinase A (PKA, cyan;
Protein Data Bank ID, 1ATP). The activation segments of xIKKβ and PKA are
shown in red and black, respectively.

Consistent with an important role of the ULD in interacting with the
SDD, mutations in residues that are not at the interface, P347A, L361A
and Q351A, had minimal effects on NF-κB activation. In contrast,
mutation in a residue within a ULD surface loop that contacts the SDD,
G358A (Fig. 3a), affected IKKβ-induced NF-κB activity. It was also
shown that Leu 353 is required for IKKβ activity; however, Leu 353 is
buried in the ULD core and the L353A mutation may have disrupted
IKKβ structural integrity. Double substitutions of hIKKα and hIKKβ,
which are equivalent to I608R/Y609P of the SDD of xIKKβ, did not
affect their interaction with wild-type IKKβ but negatively impacted
kinase activity; Ile 608 is buried in the SDD core and Tyr 609 directly
interacts with the ULD (Fig. 3a).

Like the ULD, the KD also makes contact with the N-terminal end
of the SDD (Fig. 3b), burying a surface area of ~650 Å² of each
interface. The binding forces are mainly van der Waals in nature.
Limited hydrophobic interactions are observed between residues
Ala 252 and Val 253 of the KD and Tyr 423 of the SDD, and between
Phe 111 of the KD and the hydrophobic portions of Arg 572, Arg 575
and Glu 576 of α5s of the SDD. The KD is linked to the ULD through a
two-amino-acid linker between αI of the KD and β1′ of the ULD
(Fig. 3c), burying only ~350 Å² of surface area. Side-chain contacts
between Leu 311 of the ULD and Ile 268 of the KD are observed, and
together form the small hydrophobic patch at the KD-ULD junction
consisting also of Leu 269 and Ile 306 of the KD and Leu 309 of the
linker. Together with the ionic interaction between Asp 373 of the
ULD and Arg 123 of αE of the KD, these interactions may confer
rigidity to the KD-ULD junction despite the small buried surface area.

Structural comparison with other kinase structures revealed a
similarity between the locations of the SDD and ULD and those of
several known docking sites for substrates and regulatory proteins.
In the crystal structure of the Ser/Thr kinase SRPK1 in complex with a
docking motif in its substrate, ASF/SF2 (SRSF1), the peptide motif
interacts with the kinase at the location of the SDD (Fig. 3d). In the
structure of the TAK1 KD fused with the C-terminal region of its
binding protein, TAB1, TAB1 interacts with the C lobe of the kinase
at a position analogous to both the SDD and the ULD, presumably to
activate the kinase (Fig. 3d).


ULD-SDD binds the IκBα C-terminal region 


Previous studies have suggested that the ULD in TBK1 and IKK-i is
involved in substrate recognition because its deletion impaired activity
of the respective kinases and a glutathione S-transferase (GST)-ULD
fusion protein interacted with the specific substrate, IRF3 or IRF7.
Because ULD deletion in IKKβ also abolished its activity, we proposed
that the ULD may recognize its specific substrate, IκBα.
However, we were surprised to find that GST-IκBα (1-317 and 54-317)
pulled down only a minute amount of the ULD (Fig. 4a, lanes 9
and 13) and GST alone did not pull down any (Fig. 4a, lane 4), suggesting
that the interaction of IκBα with the ULD is specific, but very weak.
In contrast, GST-IκBα robustly pulled down full-length IKKβ or IKKβ
lacking the C-terminal NEMO-binding domain (Fig. 4a, lanes 15 and 16).

IκBα has an N-terminal region (1-54) that contains cognate
phosphoacceptor sites at Ser 32 and Ser 36, followed by a C-terminal region
(55-317) that contains multiple ankyrin repeats and the PEST
region. Strikingly, the N-terminal region of IκBα did not pull
down any IKKβ constructs (Fig. 4a, lanes 5-8), and the C-terminal
region of IκBα interacted robustly with full-length IKKβ as well as its
ULD-SDD region (Fig. 4a, lanes 9-12), and very weakly with the ULD
alone (Fig. 4a, lane 9). Further mapping on IκBα showed that both
ankyrin repeats (1-282 or 54-282) and the PEST region (282-317)
interacted with IKKβ ULD-SDD (Fig. 4b, lanes 4-6). The extent of
pull-down suggests that the PEST region contributes more to IKKβ
interaction than do the ankyrin repeats. Despite trying multiple constructs,
we could not obtain soluble protein for IKKβ SDD to test its
interaction with IκBα. However, the pull-down data suggest a dominant
role for SDD in IκBα interaction. In any case, it is clear that the
mutual interaction between IKKβ and IκBα is mediated by their
ULD-SDD and C-terminal domains, respectively.


Figure 3 | Interactions among the KD, ULD and SDD. a, Interaction between
the ULD (magenta) and SDD (blue). Important interfacial side chains are shown
with nitrogen atoms in blue, oxygen atoms in red, sulphur atoms in orange and
carbon atoms in either pink (ULD) or light blue (SDD). Gly 358 is marked with a
black ball on the main chain. b, Interaction between the KD (yellow) and SDD
(blue). c, Interaction between the KD (yellow) and ULD (magenta). d, Locations
of the TAB1 peptide (green ribbon; PDB ID, 2EVA) and the ASF/SF2 peptide
(purple stick model; PDB ID, 1WBP) relative to the IKKβ structure after
superposition of the corresponding kinase domains.


ULD-SDD limits specificity and the ULD aids catalysis 


The specific interaction between ULD-SDD of IKKβ and IκBα suggests
that the Michaelis constant, Km, of phosphorylation by IKKβ
might be lower for full-length IκBα than for its N-terminal region
(1-54) alone. We performed kinase kinetics analysis of IKKβ EE
against the two different substrates. Unexpectedly, the measured
apparent Km values were 11.4 and 13.7 µM for full-length IκBα and
the N-terminal region alone, respectively (Fig. 4c, d), suggesting that
binding at the SDD, an exosite, does not alter the Km of the reaction
drastically. This could be due to opposing effects of the SDD-IκBα
interaction, which increases substrate interaction but slows down
product dissociation. The relative maximum velocity, Vmax, values
were 502.4 and 250.2, respectively, suggesting that the SDD-IκBα
interaction moderately enhances catalysis.
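
To make the comparison concrete, the following minimal sketch evaluates the Michaelis-Menten rate law with the apparent values quoted above. It assumes simple Michaelis-Menten behaviour for both substrates; the dictionary and function names are hypothetical, and the Vmax values are relative, so the computed rates carry the same arbitrary units.

```python
# Apparent kinetic parameters quoted in the text (Vmax in relative units).
FULL_LENGTH = {"km_um": 11.4, "vmax": 502.4}
N_TERMINAL = {"km_um": 13.7, "vmax": 250.2}


def michaelis_menten_rate(substrate_um, vmax, km_um):
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_um / (km_um + substrate_um)


def vmax_over_km(params):
    """Vmax / Km, which governs the rate when [S] is well below Km."""
    return params["vmax"] / params["km_um"]


if __name__ == "__main__":
    for s_um in (1.0, 10.0, 100.0):
        v_full = michaelis_menten_rate(s_um, FULL_LENGTH["vmax"], FULL_LENGTH["km_um"])
        v_nterm = michaelis_menten_rate(s_um, N_TERMINAL["vmax"], N_TERMINAL["km_um"])
        print(f"[S] = {s_um:6.1f} uM: rate ratio (full-length / N-terminal) = {v_full / v_nterm:.2f}")
    print(f"Vmax ratio = {FULL_LENGTH['vmax'] / N_TERMINAL['vmax']:.2f}")
    print(f"Vmax/Km ratio = {vmax_over_km(FULL_LENGTH) / vmax_over_km(N_TERMINAL):.2f}")
```

Because the two Km values are close, the rate ratio stays near the Vmax ratio of about 2 across substrate concentrations, which is the sense in which the effect on catalysis is moderate.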

Casein kinase 2 phosphorylates the C-terminal PEST region of
IκBα around residues 283-299. To determine whether the SDD-IκBα
interaction restricts substrate specificity, we compared the
kinase activity of purified IKKβ EE proteins against either IκBα or
its S32A/S36A mutant (AA) using [γ-³²P]ATP (Fig. 4e). Although the
KD-ULD (1-400) construct gave rise to a small amount of protein, it
produced robust phosphorylation of wild-type IκBα, comparable to
full-length IKKβ (1-756), suggesting that it is catalytically competent.
Remarkably, KD-ULD effectively phosphorylated the AA mutant,
which, in contrast, was a poor substrate for full-length IKKβ. The
C-terminal PEST region seemed to harbour the major sites of
phosphorylation by KD-ULD because a construct lacking PEST (1-282)
was not significantly phosphorylated by KD-ULD but was
phosphorylated by full-length IKKβ (Fig. 4e). In addition, when IκBα
phosphorylation was detected by antibody against IκBα phosphorylated
at Ser 32/Ser 36, only very weak phosphorylation was seen for
the KD-ULD construct in comparison with full-length IKKβ (Fig. 4f).
These experiments suggest that although the KD-ULD construct is
active, it possesses an altered substrate specificity causing preferential
phosphorylation of the C-terminal PEST sequence, consistent with a
previous observation. Hence, ULD-SDD seems to position IκBα
specifically such that only its N-terminal region is accessible to the
IKKβ catalytic pocket (Fig. 4g).

We expressed three kinase domain constructs, 1-301, 1-310 and
1-360, in insect cells and obtained only a small amount of protein for
the 1-360 construct. Kinase assay showed that IKKβ EE (1-360) had
very low residual activity against IκBα or its AA mutant (Fig. 4e),
suggesting that KD activity is compromised in the absence of the
ULD. Without an isolated KD structure, we cannot deduce the
molecular mechanism by which the ULD acts. However, in analogy
to TAK1 activation by TAB1 (Fig. 3d), it may be speculated that this
KD-ULD interaction allosterically potentiates kinase activity. Further
kinase assay using antibody against IκBα phosphorylated at Ser 32/
Ser 36 showed no detectable activity of the KD alone (Fig. 4f),
confirming that the low residual activity may also be directed towards the
C-terminal PEST region, similar to KD-ULD. Therefore, whereas
ULD-SDD is critical for IKKβ specificity, ULD is required for its
catalytic activity.


SDD mediates IKKβ dimerization 

Full-length hIKKβ (1-756) and its KD-ULD-SDD region (1-678) are
dimers in solution as determined by gel-filtration chromatography
(Fig. 5a). In the P1 and I4₁22 crystals, two conserved dimerization
interfaces exist, one mediated by the SDD and the other mediated by
the KD. Because hIKKβ (1-643) that truncates into the SDD is a
monomer in solution (Fig. 5a), SDD-mediated dimerization
(Fig. 5b) is probably of greater importance than KD-mediated
dimerization.

The SDD in an IKKβ dimer does not form extensive interactions along
the entire length of the helical bundle. Rather, interactions
are mostly localized at the end of the bundle (Fig. 5b), distal from the
KD and ULD and burying ~1,000 Å² of surface area of each monomer.
The interaction is mostly hydrophobic. Residues that contribute
significantly to dimerization include Gln 478, Lys 482, Phe 485,
Ile 492, Lys 496, Ile 505, Gln 651, Leu 654, Trp 655, Leu 658 and
Ile 660, with residues Leu 654, Trp 655 and Leu 658 from helix α6s
burying the most surface area (Fig. 5b). This dimerization interface is
entirely unexpected as it was predicted earlier that the LZ mediates
IKKβ dimerization. The structure now explains that failed dimerization
of the LZ-defective L462S/L469S mutant of IKKα is due to the
structural role of Leu 469, whose equivalent residue of IKKβ, Met 472,
is buried in the SDD core. Superposition of four IKKβ dimers in P1
and the single IKKβ dimer in I4₁22 shows that IKKβ dimers are
conserved (Supplementary Fig. 7). In I4₁22, the distal part of the
SDD is not visible, owing to a lack of crystal packing in this region
and dynamic disorder, not degradation (Supplementary Fig. 8).

To confirm the observed interface in IKKβ dimerization, we performed
structure-based mutagenesis on residues that bury the most
surface area, Leu 654, Trp 655 and Leu 658. These residues are not
within the predicted LZ region. Two purified double mutants, L654D/
W655D and W655D/L658D, failed to dimerize, as shown by the
considerable shift in gel filtration positions (Fig. 5c). Furthermore,
equilibrium analytical ultracentrifugation experiments showed that
wild-type IKKβ is indeed a dimer whereas one of the IKKβ mutants is
a monomer and the other has a 167-fold weaker dimerization affinity
(Fig. 5c).
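
For a sense of scale, a fold change in dissociation constant can be converted into a change in binding free energy through ΔΔG = RT ln(fold change). The sketch below applies this standard relation to the 167-fold figure; the temperature of 298.15 K is an assumption (the text does not give the temperature of the ultracentrifugation experiments), and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free-energy penalty (kcal/mol) for a given fold weakening in affinity.

    Uses ddG = R * T * ln(fold_change); the default temperature is an assumption.
    """
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    print(f"ddG for a 167-fold weaker affinity: {ddg_from_fold_change(167.0):.2f} kcal/mol")
```

At this assumed temperature the penalty is roughly 3 kcal/mol, a loss on the order expected from removing a few hydrophobic contacts at an interface.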


Dimerization in IKKβ activation but not activity 


The geometry of the IKKβ dimer with KDs facing away from each
other suggests that each IKKβ protomer is independent in its kinase
activity. To confirm this, we transfected HEK293T cells with hIKKβ
EE mutants, L654D/W655D (654-655), W655D/L658D (655-658)
and L654D/W655D/L658D (654-655-658), and found that all
mutants had robust kinase activity (Fig. 5d). Dimerization mutants
expressed in insect cells and purified showed the same results. In
addition, a purified IKKβ construct with truncation into the SDD
(1-643) that is monomeric in solution (Fig. 5a) was active in
phosphorylating IκBα (Fig. 5d).

Previous studies have shown that IKKβ can undergo
trans-autophosphorylation and activation on transfection. This
autophosphorylation and phosphorylation by TAK1 probably both contribute
to IKKβ activation on cell stimulation. To determine whether the
observed dimerization interface is critical for this activation, we tested
dimerization mutations in IKKβ activation on overexpression in
HEK293T cells (Fig. 5e). Whereas wild-type IKKβ was robustly activated,
the L654D/W655D and W655D/L658D mutants completely and
partly lost this activation, respectively, in a manner that correlates with
the degree of impairment in dimerization (Fig. 5c, e). Overexpression of
IKKβ in NEMO-deficient mouse embryonic fibroblasts resulted in its
activation, but to a lesser extent than in wild-type mouse embryonic
fibroblasts (Fig. 5f). We found that IKKβ dimerization mutants failed to
interact with NEMO efficiently (Fig. 5g), perhaps owing to reduced
avidity for oligomeric NEMO. Therefore, although IKKβ kinase activity
does not depend on dimerization once its activation loop is phosphorylated,
IKKβ activation requires dimerization and is probably enhanced
by interaction with NEMO.


Discussion 


The IKKβ structure presented here predicts that IKKα, TBK1 and
IKK-i all have an overall structural organization that comprises KD,
ULD and SDD. Although a ULD was not predicted in IKKα, conservation
of buried residues in this region and of the ULD-SDD interface
suggests that IKKα also has this domain (Supplementary Fig. 4). The
ULD and SDD probably have the same structural role but may have
evolved additional, differential functions in the individual kinases.
SDD-mediated dimerization may also be conserved. In particular,
residues at the observed IKKβ dimerization interface are highly conserved
in IKKα (Supplementary Fig. 4), explaining how IKKα and
IKKβ can form both homo- and heterodimers. Given our structure
of IKKβ and the previously determined structures of NEMO fragments,
we may speculate that the full IKK complex is also a dimer,
or a dimer of dimers with a molecular mass of around 270 or 540 kD.
The apparent 700-900-kD molecular mass of the IKK holoenzyme on
gel filtration may be due to the elongated nature of NEMO and the
complex (Supplementary Fig. 9).

Because the conserved IKKβ dimer structure does not place the
KDs close to each other for trans-autophosphorylation, we wondered
whether higher-order oligomerization, which is probably enhanced
by NEMO and its ability to bind ubiquitin, is responsible for this
autoactivation. In both P1 and I4₁22, IKKβ exists as dimers of dimers
(Fig. 5h and Supplementary Fig. 10). In particular, active sites of two
neighbouring protomers in the tetramer face each other such that an
activation loop from one protomer might reach into the active site of
the other (Fig. 5i), which may mimic an autophosphorylation state.

The ULD-SDD region of IKKβ directly interacts with the
C-terminal portion of IκBα. This interaction may serve several purposes.
First, it probably orients IκBα such that its N-terminal cognate
phosphorylation sites are presented to the KD active site (Fig. 4g).
Without this interaction, IKKβ preferentially phosphorylates the
C-terminal PEST motif of IκBα. Second, the interaction seems to
enhance IKKβ activity. Third, phosphorylation at the PEST by casein
kinase 2 or other kinases may regulate IκBα interaction with IKKβ
and hence affect phosphorylation at the cognate sites. Fourth, the
interaction may allow concerted phosphorylation at both Ser 32 and
Ser 36 of IκBα without intervening dissociation. In MAP-kinase cascades
that involve dual phosphorylations, specific docking interactions
occur between the kinases and their substrates. The
β-catenin protein in the Wnt signalling pathway contains the same
dual-phosphorylated destruction box motif as IκBα (ref. 40).
Consistent with this, β-catenin is brought to the responsible kinase,
GSK-3, by means of the adaptor protein axin, allowing both specificity
and concerted phosphorylation. Therefore, in a general statement
that structure serves function, the IKKβ structure seems to fit its
function as the IκBα kinase perfectly, being poised to turn on the
important NF-κB pathway specifically, efficiently and concertedly
in response to cellular physiology.


METHODS SUMMARY 


Xenopus laevis IKKβ was expressed in insect cells and purified to homogeneity
using nickel affinity chromatography, ion exchange and gel filtration
chromatography. We crystallized the P1 and I4₁22 forms at 4 °C in polyethylene glycol 6000
and potassium/sodium phosphate, respectively. The structure was determined by
multiwavelength anomalous diffraction of the selenomethionyl protein.


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.
METHODS


Protein expression and purification. To elucidate the molecular basis of IKKβ
function, we expressed IKKβ from a number of species using baculovirus-mediated
insect cell expression. The xIKKβ sequence in the NCBI database starts at a Met
residue that is equivalent to Met 17 of both the hIKKβ and the mouse IKKβ (mIKKβ)
sequences. Translation of the DNA sequence preceding the ATG codon of Met 17
revealed sequences that are almost identical to residues 9-16 of hIKKβ and mIKKβ.
These were taken as part of the xIKKβ sequence and residues 1-8 were taken from the
corresponding mIKKβ sequence. This reconstructed xIKKβ sequence has the same
residue numbering as the hIKKβ sequence until after the SDD.

Various constructs of IKKβ wild type and the phosphomimetic S177E/S181E
mutant were designed with an N-terminal polyhistidine tag and a tobacco etch
virus protease-cutting site between the tag and the protein. Recombinant IKKβ
baculoviruses were made in DH10BAC cells, amplified and used to infect Hi5
insect cells in serum-free media (Invitrogen). The cells were cultured in
suspension and harvested 48 h after infection. The recombinant proteins were purified
by nickel affinity chromatography, anion exchange and gel filtration
chromatography. For crystallization, the polyhistidine tag was cleaved by the tobacco etch
virus protease during protein purification.

All IκBα proteins were expressed in Escherichia coli using pET28a, pGEX4T3
and pET-SUMO vectors and purified by their respective affinity tags. For kinase
assays, the SUMO tag was cleaved from IκBα proteins. His-ULD and His-ULD-SDD
of hIKKβ were also expressed in E. coli using the pET28a vector.

Crystallization and data collection. Unlike many protein kinases, the IKKβ KD
cannot be recombinantly expressed as a well-behaved biochemical entity for structural
studies. In addition, after mapping a compact region by limited proteolysis, IKKβ was
still refractory to crystallization, both alone and in the presence of various ATP
analogues. To overcome this obstacle, we used several IKKβ inhibitors, including
Cmpd1 and Cmpd2 (Supplementary Fig. 1), which were identified against the
S177E/S181E (EE) mutant, in co-crystallization. A hIKKβ EE construct (1-678)
lacking only the C-terminal NBD did crystallize; however, these crystals only diffracted
to a resolution of ~7.5 Å. Searching for IKKβ orthologues that might give better crystals
led to success in crystallizing the analogous region of xIKKβ EE (4-675; Fig. 1a).

The xIKKβ (S177E/S181E) protein containing residues 4-675 was concentrated
by ultrafiltration (Amicon) to about 15 mg ml⁻¹ in 20 mM Tris-HCl
(pH 8.0), 150 mM NaCl and 10 mM dithiothreitol (DTT). It was mixed with an
inhibitor compound in a 1:2 molar ratio before crystallization. Cmpd1 is 4-((4-
(4-(chlorophenyl)pyrimidin-2-yl)amino)phenyl)(4-(2-hydroxyethyl)piperazin-
1-yl)methanone and Cmpd2 is 1-(4-(4-((4-(4-(pyridin-4-ylsulfonyl)phenyl)-
pyrimidin-2-yl)amino)benzoyl)piperazin-1-yl)ethanone. The P1 crystals were
grown using hanging-drop vapour diffusion at 4 °C by mixing equal volumes of
the purified protein and the crystallization condition of 100 mM N-(2-acetamido)
iminodiacetic acid at pH 6.5, 10% (w/v) polyethylene glycol 6000, 50 mM Li₂SO₄,
300 mM NaCl and 10 mM DTT. The I4₁22 crystals were grown at 4 °C with well
solution containing 1.8 M K/Na phosphate at pH 5.6 and 10 mM DTT. For data
collection, all crystals were flash frozen in the respective crystallization conditions
supplemented with 25% (v/v) ethylene glycol. Diffraction data were collected at the
24ID-C beam line of the Advanced Photon Source. Multiwavelength anomalous
diffraction (MAD) data on heavy-atom derivative crystals or selenomethionyl
crystals were collected near the respective absorption edges. All diffraction data
were processed using the HKL2000 suite and their statistics are shown in
Supplementary Table 1 and Supplementary Table 2.
Structure determination, refinement and analysis. The initial xIKKβ crystals
grew in the P1 space group in the presence of the inhibitor Cmpd1 or Cmpd2 and
diffracted to a resolution of 3.6 Å. Selenomethionyl crystals were obtained, but we failed
to locate the large number of expected selenium sites. Among the extensive heavy-atom
searches, an ytterbium derivative was obtained, with eight well-defined sites, which
probably correspond to eight IKKβ molecules in the asymmetric unit. However, the
electron density map calculated from a three-wavelength ytterbium anomalous
diffraction data set was insufficient for tracing, and phase combination with the
selenomethionyl data set could not be performed, owing to non-isomorphism.

The structure determination was eventually successful in the alternative crystal
form, I4₁22, which contains one molecule of IKKβ in complex with Cmpd1 and
diffracted to a resolution of 4.0 Å, using MAD of the selenomethionyl crystals
(Supplementary Tables 1 and 2 and Supplementary Fig. 2). Twelve selenium sites were
determined using the program SHELXD and refined with the program
MLPHARE in the CCP4 suite. MAD phases were calculated at a resolution of
4.0 Å with data from I4₁22 crystals using the program SHARP. A few cycles of
model building and refinement were carried out with the program O and
REFMAC with TLS parameterization. The I4₁22 crystals contain one monomer
in the asymmetric unit and 80% solvent when calculated with the entire IKKβ
construct and 84% solvent when considering only the ordered part of the structure.
The inhibitor Cmpd1 has density in the MAD experimental map and the Fo − Fc
omit map. The Dundee PRODRG2 Server was used to generate topology and
restraint files of the compound for refinement. The refined model contains residues
16-236, 243-286, 290-376, 384-394, 401-475 and 528-637 and Cmpd1.

The structure of the P1 form was determined by molecular replacement using
the refined model of the I4₁22 crystal form as the search model, in which eight
molecules were located. Selenium sites from the single-wavelength anomalous
diffraction data set of a selenomethionyl crystal in the P1 space group were calculated
by difference Fourier analysis and used to verify residue registration in the P1
structure. Refinement of the P1 structure was conducted at a resolution of 3.6 Å
and incorporated tight non-crystallographic symmetry restraints (root mean
squared deviation of 0.02 Å in atom positions). After several rounds of refinement
at a resolution of 3.6 Å, new electron densities appeared in the P1 crystal form to
complete the model building. Although Cmpd2 was in the crystallization
condition, it did not have clear density and was not included in the refinement. The
refined model of the P1 crystal form contains four IKKβ dimers in the asymmetric
unit. Three of the dimers encompass residues 16-236, 243-286, 290-376,
384-394, 401-551 and 559-666. One dimer contains the same residues as the structure
in I4₁22. The structures were analysed using the CCP4 suite and the Dali
server, and the figures were made using PYMOL.

GST pull-down. The tagged proteins were first purified with glutathione or
Ni-NTA beads and their expression levels were assessed by SDS-polyacrylamide gel
electrophoresis (SDS-PAGE). Beads containing estimated equivalent quantities of
the tagged proteins were mixed with the cell lysates or the purified versions of the
interaction partners. The mixtures were incubated at room temperature (20 °C) for
1 h with rotation. After centrifugation, the supernatants were removed. The beads
were then washed twice, eluted and subjected to SDS-PAGE analysis. All
pull-down experiments were repeated two to four times with consistent results.

Kinase assays using anti-phospho-IκBα antibody. The hIKKβ proteins (0.1 µg
µl⁻¹) were incubated with recombinant IκBα (1 µg µl⁻¹) in 50 mM Tris-HCl at pH
8.0, 100 mM NaCl, 10 mM MgCl₂ and 2 mM DTT for 30 min at 30 °C. SDS-PAGE
sample buffer was used to terminate the reactions. The products were separated
on 15% SDS-PAGE and transferred to PVDF membranes. Anti-phospho-IκBα
antibody (Cell Signaling Technology) was used to detect phospho-IκBα.

Determination of Km of hIKKβ for IκBα (1-54) and full-length IκBα. To derive
the Km of hIKKβ for full-length IκBα, kinase assays were performed at substrate
concentrations of 2.8, 5.3, 10.6, 21.3, 42.5 and 85 µM. Similarly, to derive the Km of
hIKKβ for IκBα (1-54), kinase assays were performed at substrate concentrations of
5.3, 10.6, 21.3, 42.5, 85 and 170 µM. A time course of the kinase reactions was first
performed to select a hIKKβ amount and a time point within which the reactions are
linear with time. The final selected reactions contain 10 ng baculovirus-expressed
hIKKβ, 100 mM cold ATP and 1 µl [γ-³²P]ATP (3,000 Ci mmol⁻¹, 1 mCi per
100 µl) in 25 µl of reaction buffer containing 50 mM Tris-HCl at pH 8.0, 100 mM
NaCl, 10 mM MgCl₂ and 2 mM DTT. The phosphotransfer reaction was allowed to
proceed for 10 min at 30 °C and quenched with SDS-PAGE sample buffer. The
products were separated on 15% SDS-PAGE and subjected to autoradiography. The
relative amounts of phosphorylated IκBα were quantified using IMAGEJ, plotted
against total IκBα concentrations and fitted using nonlinear regression to the
Michaelis-Menten equation to obtain Km using SIGMAPLOT.
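
The nonlinear regression step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the SIGMAPLOT procedure used in the paper: the function names are hypothetical, the band intensities are synthetic (generated from an assumed Km of 20 µM and an assumed maximum signal of 1.0), and the fit uses a simple least-squares search in which the maximum signal is solved in closed form for each trial Km.

```python
import math

def mm_rate(s, vmax, km):
    """Michaelis-Menten signal at substrate concentration s."""
    return vmax * s / (km + s)

def _vmax_and_sse(km, conc, signal):
    """For a fixed Km the model is linear in Vmax, so solve Vmax exactly."""
    f = [s / (km + s) for s in conc]
    vmax = sum(y * fi for y, fi in zip(signal, f)) / sum(fi * fi for fi in f)
    sse = sum((y - vmax * fi) ** 2 for y, fi in zip(signal, f))
    return vmax, sse

def fit_michaelis_menten(conc, signal):
    """Least-squares estimate of (Km, Vmax) from paired concentration/signal data."""
    def sse_at(log_km):
        return _vmax_and_sse(10.0 ** log_km, conc, signal)[1]

    # Coarse scan over log10(Km), three decades either side of the data range.
    lo = math.log10(min(conc)) - 3.0
    hi = math.log10(max(conc)) + 3.0
    step = 0.05
    n_steps = int(round((hi - lo) / step))
    best = min((lo + i * step for i in range(n_steps + 1)), key=sse_at)

    # Golden-section refinement around the best grid point.
    a, b = best - step, best + step
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sse_at(c) < sse_at(d):
            b = d
        else:
            a = c
    km = 10.0 ** ((a + b) / 2.0)
    vmax, _ = _vmax_and_sse(km, conc, signal)
    return km, vmax

# Substrate concentrations (µM) quoted for full-length IκBα; the signal values
# are synthetic and stand in for quantified band intensities.
substrate_uM = [2.8, 5.3, 10.6, 21.3, 42.5, 85.0]
synthetic_signal = [mm_rate(s, 1.0, 20.0) for s in substrate_uM]
km_fit, vmax_fit = fit_michaelis_menten(substrate_uM, synthetic_signal)
print(f"Km = {km_fit:.2f} µM, Vmax = {vmax_fit:.3f} (arbitrary units)")
```

Because autoradiography yields relative intensities, only Km is meaningful in absolute terms here; the fitted maximum is in arbitrary units.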

Transfection, immunoprecipitation and kinase assay. The constructs
Flag-hIKKβ EE and its truncation mutants; HA-hIKKβ EE and its dimerization
mutants L654D/W655D, W655D/L658D and L654D/W655D/L658D; and
Flag-hIKKβ and its dimerization mutants L654D/W655D and W655D/L658D
were generated in the vector pcDNA3 by conventional PCR. All IKKβ constructs
were transfected in HEK293T cells with Lipofectamine 2000 (Invitrogen). After
24 h, cell extracts were immunoprecipitated with anti-Flag antibodies bound to
agarose beads (M2, Sigma) or anti-HA bound to agarose beads (Sigma). IKKβ
kinase assays were essentially done as previously described. Briefly,
immunoprecipitates were incubated with 2 µM full-length IκBα (1-317) in 20 mM HEPES
at pH 7.5, 10 mM MgCl₂, 20 mM β-glycerophosphate, 10 mM PNPP, 50 mM
Na₃VO₄, 1 mM DTT, 20 mM ATP, and 1-10 mCi [γ-³²P]ATP at 30 °C for
30 min, and subjected to SDS-PAGE and autoradiography. Immunoblotting
was performed using anti-Flag (Sigma), anti-HA (Sigma) or anti-IKKβ antibodies
(Upstate, 05-535).

Equilibrium analytical ultracentrifugation measurements. Experiments were
performed in a Beckman XL-A/I analytical ultracentrifuge (Beckman-Coulter),
using six-cell centre pieces with straight walls, a 12-mm path length and sapphire
windows. Samples were kept and diluted in 50 mM Tris-HCl at pH 8.0 and
300 mM NaCl. Samples from wild-type protein were diluted to 6.9, 4.5 and
2.4 µM, mutant L654D/W655D to 7.4, 4.8 and 2.6 µM and mutant
W655D/L658D to 4.9, 3.2 and 1.7 µM for channels A, B and C, respectively. Dilution
buffer was used as blank. All samples were run at 4 °C at 9,000 r.p.m. (5,900g; held
for 20 h then scanned four times at 1-h intervals), 11,000 r.p.m. (8,800g; held for
10 h then scanned four times at 1-h intervals), 14,000 r.p.m. (14,300g; held for
10 h then scanned four times at 1-h intervals) and 17,000 r.p.m. (21,000g; held for
10 h then scanned four times at 1-h intervals). Detection was by ultraviolet
absorption at 280 nm. Solvent density and the protein partial specific volume
at each temperature were determined using the program SEDNTERP (Alliance
Protein Laboratories). For calculation of K_D and the apparent molecular weight,
all useful data were used in a global fit, using the program HETEROANALYSIS,
obtained from University of Connecticut (http://www.biotech.uconn.edu/auf).
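
As a quick consistency check on the rotor speeds and g-forces quoted above, the standard conversion RCF = 1.118 × 10⁻⁵ × r × N² (r in cm, N in r.p.m.) can be applied. The radius is not stated in the text; the 6.5 cm used below is an assumption chosen because it reproduces the quoted values, and the function name is hypothetical.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) from rotor speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2

# Assumed radial position of the sample; not given in the Methods text.
ASSUMED_RADIUS_CM = 6.5

# (rotor speed in r.p.m., g-force quoted in the text)
quoted_runs = [(9000, 5900), (11000, 8800), (14000, 14300), (17000, 21000)]

for rpm, quoted_g in quoted_runs:
    computed_g = rcf_from_rpm(rpm, ASSUMED_RADIUS_CM)
    print(f"{rpm} r.p.m. -> {computed_g:.0f}g (quoted {quoted_g}g)")
```

Under this assumed radius, all four computed values fall within about 1% of the figures given in the text.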
The nucleobase/ascorbate transporter (NAT) proteins, also known
as nucleobase/cation symporter 2 (NCS2) proteins, are responsible
for the uptake of nucleobases in all kingdoms of life and for the
transport of vitamin C in mammals. Despite functional characterization
of the NAT family members in bacteria, fungi and mammals,
detailed structural information remains unavailable. Here we report
the crystal structure of a representative NAT protein, the Escherichia
coli uracil/H⁺ symporter UraA, in complex with uracil at a resolution
of 2.8 Å. UraA has a novel structural fold, with 14 transmembrane
segments (TMs) divided into two inverted repeats. A pair of
antiparallel β-strands is located between TM3 and TM10 and has an
important role in structural organization and substrate recognition.
The structure is spatially arranged into a core domain and a gate
domain. Uracil, located at the interface between the two domains, is
coordinated mainly by residues from the core domain. Structural
analysis suggests that alternating access of the substrate may be
achieved through conformational changes of the gate domain.

Representative NAT/NCS2 family proteins include the vitamin C
transporters SVCT1 and SVCT2 in mammals; the uracil transporter
SNBT1 in rat; the uric acid/xanthine transporter UapA in Aspergillus
nidulans; the UapA homologue, YgfO (also known as XanQ), in
E. coli; and the uracil and 5-fluorouracil transporter UraA
(Supplementary Fig. 1). NAT proteins contain a signature motif,
(A, G, S)(Q, E, P)N-X-G-X₄-T(R, K, G), where X denotes a nonspecific amino acid,
that may be involved in substrate recognition and transport.
Whereas the mammalian NAT proteins are sodium symporters, their
bacterial homologues are proton symporters. Uptake of nucleobases
can also be mediated by members of the nucleobase/cation symporter 1
(NCS1) family, which share little sequence similarity with
NAT/NCS2 proteins. The NCS1 transporter Mhp1 is structurally similar to
LeuT. It remains to be seen whether the NAT/NCS2 transporters
also conform to the LeuT fold.
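
The 11-residue signature motif given above maps directly onto a regular expression, which may help when scanning candidate sequences. The sketch below is illustrative only: the function name is hypothetical and the test peptide is a made-up toy sequence, not the UraA sequence.

```python
import re

# (A,G,S)(Q,E,P)N-X-G-X4-T(R,K,G): 11 residues, X = any amino acid.
NAT_SIGNATURE = re.compile(r"[AGS][QEP]N.G.{4}T[RKG]")

def find_nat_motifs(sequence):
    """Return (1-based start position, matched segment) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in NAT_SIGNATURE.finditer(sequence.upper())]

# Toy peptide with one embedded motif (hypothetical, for illustration only).
toy_peptide = "MLLV" + "GENAGLLAVTR" + "KKA"
print(find_nat_motifs(toy_peptide))
```

Note that `finditer` reports non-overlapping matches only, which is adequate for a motif expected once per protein.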

Uracil binds to recombinant UraA protein with a dissociation constant
of approximately 0.41 ± 0.07 µM (s.d. of three independent experiments)
as measured by scintillation proximity assay (Supplementary
Fig. 2a). Uracil proved to be essential for generation of usable UraA
crystals grown in the presence of n-nonyl-β-D-glucopyranoside (β-NG).
We determined the structure using mercury-based single isomorphous
replacement (Supplementary Table 1). The electron density was of
excellent quality (Supplementary Fig. 3a) except for that of residues
179-195, which were modelled as poly-Ala. During structure refinement,
a prominent, disc-shaped electron density reminiscent of uracil
appeared in the centre of UraA after most amino acids were modelled
(Supplementary Fig. 3b). Because crystal growth strictly depended on
the inclusion of 1 mM uracil in the protein purification and crystallization
solution, we modelled uracil into the density and refined the
final atomic model at a resolution of 2.8 Å (Supplementary Table 1 and
Supplementary Fig. 3c).
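
To put the two concentrations above in perspective, the expected occupancy of the binding site at 1 mM uracil can be estimated from the measured dissociation constant. This is a back-of-the-envelope sketch that assumes simple 1:1 binding with uracil in large excess over protein (so free and total uracil are effectively equal); the function name is hypothetical.

```python
def fraction_bound(ligand_uM, kd_uM):
    """Fractional occupancy for 1:1 binding: [L] / (Kd + [L])."""
    return ligand_uM / (kd_uM + ligand_uM)

KD_UM = 0.41          # dissociation constant reported in the text (µM)
URACIL_UM = 1000.0    # 1 mM uracil used during purification and crystallization

occupancy = fraction_bound(URACIL_UM, KD_UM)
print(f"Estimated occupancy at 1 mM uracil: {occupancy:.4%}")
```

Under these assumptions the site is essentially saturated, consistent with uracil being modelled into the density.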

UraA contains 14 transmembrane segments, with both the amino
and the carboxy termini located on the cytoplasmic side (Fig. 1a). The
two C-terminal α-helices, α13 and α14, are only halfway into the
membrane. There is no detectable structural similarity between
UraA and Mhp1. Searching the Protein Data Bank using the Dali
server suggested that UraA has a novel fold (Supplementary Fig. 4a).
Notably, two short, antiparallel β-strands on TM3 and TM10 are
located at the centre of the structure. Each β-strand is preceded by
an extended, unwound fragment and followed by a short α-helix, α3 or
α10 (Fig. 1b). Discontinuous helices have been observed in a number
of transport proteins, but it is unusual for an unwound region to
constitute half of the transmembrane segment. These structural features
distinguish UraA from other known structures of integral membrane
proteins. The 11-residue NAT signature motif, which was predicted to
be a loop in the cytoplasm, constitutes membrane-embedded α10 in
UraA (Supplementary Fig. 1).

The 14 transmembrane segments of UraA are arranged into two
structural repeats, TM1-TM7 and TM8-TM14, which are related to
each other by a rotation of approximately 180° around an axis parallel
to the membrane bilayer. These two repeats can be superimposed with
a root mean squared deviation of 2.9 Å over 135 aligned Cα atoms
(Supplementary Fig. 4b). The presence of two inverted repeats is a
shared feature for a number of transporters and channels.
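
The geometry of an inverted repeat can be illustrated with a small numerical sketch. Taking the membrane plane as xy, a 180° rotation about an in-plane axis (here x) flips the z coordinate, so the second repeat crosses the membrane in the opposite direction. The coordinates below are invented toy points, not UraA atoms, and the RMSD is computed on pre-aligned points; a real comparison would first perform a least-squares superposition, which is not shown here.

```python
import math

def rotate_180_about_x(point):
    """Rotate a point by 180 degrees about the x axis: (x, y, z) -> (x, -y, -z)."""
    x, y, z = point
    return (x, -y, -z)

def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equal-length, pre-aligned point lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

# Toy "repeat 1": points running from the cytoplasmic side (z < 0) to the periplasm.
repeat_1 = [(1.0, 2.0, -6.0), (1.5, 2.5, -2.0), (2.0, 3.0, 2.0), (2.5, 3.5, 6.0)]

# An idealized "repeat 2": the same shape rotated 180 degrees about the in-plane axis.
repeat_2 = [rotate_180_about_x(p) for p in repeat_1]

# Rotating repeat 2 back brings it onto repeat 1, so the deviation vanishes.
print(rmsd(repeat_1, [rotate_180_about_x(p) for p in repeat_2]))
```

In the real structure the two repeats are only approximately related, which is why the reported deviation is 2.9 Å rather than zero.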

The 14 transmembrane segments are spatially organized into a core
domain and a gate domain. The core domain comprises TM1-TM4
and TM8-TM11, and the gate domain contains the other six segments
(Fig. 2a, left). The interface between the two domains is populated
mainly with hydrophobic residues from TM1, TM3 and TM8 in the
core domain and TM5 and TM12 in the gate domain (Fig. 2a, right).
Notably, the interdomain interactions are further stabilized by the
head group of a bound β-NG molecule (Supplementary Fig. 5). We
added β-NG in the last step of protein purification, whereas uracil was
included throughout purification and crystallization. Uracil binds
UraA in the absence of β-NG (Supplementary Fig. 2a), whereas addition
of β-NG to substrate-free UraA led to severe protein precipitation
that prevented biochemical analysis. Furthermore, D-glucose, the head
group of β-NG, had no impact on uracil binding or transport when
measured by scintillation proximity assay or cell-uptake assay. Thus,
we concluded that uracil-bound UraA may provide a suitable conformation
for β-NG to bind to and that β-NG binding is unlikely to
have an impact on uracil recognition.

In contrast to the hydrophobic interface between the two domains,
there are a large number of buried hydrogen bonds within the core
domain, with the β-strands of TM3 and TM10 at the centre (Fig. 2b).
Apart from conventional hydrogen bonds in antiparallel β-strands, the
hydroxyl group of Tyr 288 in TM10 makes two hydrogen bonds with
the carbonyl oxygen and amide nitrogen of Ser 71 in TM3. There are
12 additional hydrogen bonds between TM3/TM10 and other transmembrane
segments (Fig. 2b). These extensive interactions may facilitate
the conformation of the unwound regions in TM3 and TM10 as
well as restraining movement of the transmembrane segments in the
core domain.

Figure 1 | The structure of UraA reveals a novel fold. a, Overall structure of
UraA. Two perpendicular views, one from the periplasm and one from the side,
are shown. The bound uracil is indicated. b, A pair of short, antiparallel
β-strands is located in the middle of TM3 and TM10. The 2Fo − Fc electron
density map, shown in blue mesh, is contoured at 1σ on the left. All structural
figures, including the calculation of surface electrostatic potential, were
prepared with PYMOL.

The antiparallel β-strands and the connecting loops of TM3 and
TM10 provide a shelter for uracil between the core and gate domains.
The pyrimidine ring of uracil, which is roughly parallel to the β-strands,
is surrounded by negative electrostatic potentials (Fig. 3a). Recognition
of uracil is almost exclusively mediated by the core domain, involving
residues from TM1, TM3, TM8, TM10 and TM12 (Fig. 3b and
Supplementary Fig. 4c). Two Glu residues, Glu 241 and Glu 290, anchor
uracil by each making two hydrogen bonds with it (Fig. 3b). Replacement
of either Glu residue by Ala completely abrogated uracil binding
(Supplementary Fig. 2b). In addition, the two oxygen atoms of uracil
form hydrogen bonds with the amide nitrogen atoms of Phe 73 and
Gly 289. Notably, Gly 289 and Glu 290 are within the NAT signature
motif. A hydroxyl group in β-NG forms hydrogen bonds with both the
imidazole nitrogen of His 245 and the carbonyl oxygen of uracil
(Supplementary Fig. 5b), suggesting that a water molecule may occupy
the position of the β-NG hydroxyl group and make similar interactions.
In this case, His 245 may contribute to uracil binding with a
water-mediated hydrogen bond (Supplementary Fig. 5c). Consistent with this
notion, substitution of Ala for His 245 abolished uracil binding
(Supplementary Fig. 2b).


Figure 2 | Domain organization of UraA. a, Left: UraA is spatially organized
into the core (grey) and gate (cyan and blue) domains, which associate with each
other through hydrophobic interactions. Right: the residues that mediate the
interdomain interactions are shown as green (core) and blue (gate) sticks.
Uracil is shown as yellow spheres. b, Antiparallel β-strands in TM3
and TM10 provide the organizing centre for the core domain. There are a large
number of hydrogen bonds between transmembrane segments within the core
domain. Two perpendicular views are shown. Hydrogen bonds are represented
by red dashed lines.

Figure 3 | Uracil coordination by UraA. a, Uracil is located in a concave
pocket surrounded by negative electrostatic potential. The core domain of
UraA is shown in surface electrostatic potential. b, Uracil is coordinated by both
polar (left) and van der Waals (right) contacts. Uracil is shown in yellow, in
ball-and-stick form. c, Cell-based ³H-uracil-uptake assay identified the key
residues in uracil transport. The membrane expression levels of UraA variants
were monitored by western blot using an antibody against the His tag. The
amount of each protein was estimated by comparing the intensity against a
serial dilution of UraA with known concentration on the same western blot.
The reactions lasted for 30 s. Details of the experiments are described in
Supplementary Fig. 6 and Methods. WT, wild type. Error bars, s.d. of three
independent experiments.

In addition to hydrogen bonds, uracil is coordinated by van der
Waals interactions involving Ala 31, Phe 73, Tyr 288 and Tyr 342.
Phe 73 blocks access to uracil from the periplasm, whereas the phenyl
ring of Tyr 288 is roughly parallel to the pyrimidine ring of uracil.
Tyr 342 is the only residue from the gate domain that contributes to
uracil binding (Fig. 3b). The presence of aromatic residues surrounding
the substrate is commonly observed in membrane transporters,
such as the sodium-coupled sugar transporter vSGLT, the glycine
betaine transporter BetP and the arginine/agmatine antiporter
AdiC. The bulky residues effectively insulate the substrate from
the outside environment.

To corroborate structural observations, we generated a number of
UraA variants, each containing a mis-sense mutation, and examined
their ability to transport uracil into E. coli. Whereas the wild-type
UraA can mediate the uptake of ³H-uracil with a Michaelis constant,
Km, of approximately 0.5 µM (Supplementary Fig. 6), replacement of
Glu 241, His 245 or Glu 290 by Ala invariably abrogated uracil uptake
(Fig. 3c). This finding is consistent with loss of uracil binding by these
three mutants (Supplementary Fig. 2). By contrast, the UraA variants
Phe 73 Ala and Tyr 342 Ala retained the bulk of transport activities
(Fig. 3c).

Mutation of the invariant Asn residue in the NAT motif was shown to
abrogate the transport function for UapA and YgfO. In UraA, however,
only Gly 289 and Glu 290 of the NAT motif are directly involved in
substrate binding. Asn 291 is located away from the substrate-binding
site (Supplementary Fig. 7). The UraA variant Asn 291 Ala retained the
majority of the wild-type uracil-uptake activity (Fig. 3c). Because UapA
and YgfO are xanthine/uric acid permeases, the invariant Asn residue
in the NAT motif may have a more important role in these purine
transporters.

UraA is a proton-coupled symporter. Translocation of protons relies
on residues that can be protonated and deprotonated. Remarkably,
Glu 241, His 245 and Glu 290, all of which can undergo cycles of
protonation and deprotonation, are clustered at the interface between
the core and gate domains (Supplementary Fig. 8) and are essential for
uracil binding (Fig. 3b and Supplementary Fig. 2b). This arrangement
suggests that these residues may have a key role in proton translocation.
Curiously, however, only Glu 241 is invariant among all NAT
members. His 245 is replaced by Asp, Thr or Val in other NAT proteins,
whereas Glu 290 is substituted by Gln or Pro (Supplementary Fig. 1). A
closer examination revealed a pattern of function-based conservation.
Glu 290 is conserved in the three known pyrimidine permeases RutG,
PyrP and SNBT1. His 245 is conserved in RutG and PyrP, both of which
are pyrimidine/proton symporters. By contrast, His 245 is replaced by a
conserved Asp residue in the sodium symporters SNBT1, SVCT1 and
SVCT2 (Supplementary Fig. 1). This analysis suggests conserved
mechanisms of transport, with His 245 having a key role in proton
symport and Asp having a similar role in sodium symport.

An analysis of solvent accessibility using the program HOLE
unambiguously shows that uracil is occluded from the periplasm but
readily accessible from the cytoplasmic side (Fig. 4a). Thus, our current
structure of uracil-bound UraA adopts an inward-open conformation.
To load uracil, UraA must also be able to adopt an outward-open
conformation. This is probably accomplished by a rigid-body rotation
of the gate domain relative to the core domain around the bound uracil
(Fig. 4a).

Figure 4 | Working model for the transport mechanism of UraA. a, Uracil-bound
UraA is in an inward-open conformation. The van der Waals surface of
UraA (purple) was calculated with the program HOLE, which revealed that
the bound uracil is insulated from the periplasm but exposed to the cytoplasm.
The pore radii along the potential transport path are tabulated on the right.
b, Working model to illustrate the putative transport mechanism of UraA.
Glu 241, His 245 and Glu 290 are shown to emphasize their essential role in
substrate recognition and proton translocation. For simplicity, only TM5 and
TM12 are shown to represent the gate domain.
[Figure 4b schematic labels: periplasm; cytoplasm; core domain; gate domain;
protonation and uracil binding; conformational change of the gate domain;
deprotonation and uracil release; uracil-bound UraA, inward open (structure
available); substrate-released UraA, inward open (transient?).]

On the basis of structural and biochemical analyses, we propose a
working model to explain proton-coupling and uracil symport by UraA
(Fig. 4b). The default conformation of substrate-free UraA may be
outward-open, with the two Glu residues deprotonated. The negative
charges on the Glu residues make closure of the gate domain onto the
core domain energetically unfavourable. On binding of uracil and H⁺,
at least one Glu residue is protonated and the gate domain undergoes a
conformational change, taking an inward-open conformation as seen
in the structure. A proton translocation is likely to occur from Glu to
His 245 and onward into the cytoplasm. Deprotonation may cause local
conformational changes around Glu 241, His 245 and Glu 290, leading
to the dissociation of uracil. An inward-open, deprotonated conformation
is likely to be transient and quickly reverts to the outward-open
state. In the sodium symporters of the NAT family, the sodium ion,
rather than the proton, is required to neutralize the conserved Glu and
Asp residues and for substrate binding.

This model predicts that the core domain provides the molecular 
basis of substrate selectivity and proton/sodium translocation, whereas 
conformational changes of the gate domain allow transport of sub- 
strate. We recognize the speculative nature of this model, as many 
important questions remain unanswered and require experimental 
investigation. For example, we do not know the molar ratio between 
the proton/sodium ion and the substrate molecule during each trans- 
port cycle, nor do we know how the conserved Glu and His residues 
trigger conformational changes during protonation and deprotona- 
tion. Nonetheless, the structural and biochemical characterizations 
of UraA reported here provide an important framework for mech- 
anistic understanding of the NAT family transporters. 


METHODS SUMMARY 


We generated all clones using a standard PCR-based cloning strategy. Wild-type
and mutant UraA were expressed and purified to homogeneity. We grew crystals
of wild-type UraA by the hanging-drop vapour diffusion method in the presence
of 0.4% β-NG (Anatrace). All data were collected at SPring-8 beamline BL41XU
and processed with the HKL2000 package. The structure was determined by
Hg-SIRAS using SOLVE and refined with PHENIX. We measured the binding
affinity between uracil and purified recombinant UraA by scintillation proximity
assay. The cell-based uracil-uptake assay was modified on the basis of a published
protocol.


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.
METHODS


Protein preparation. The complementary DNA of full-length UraA from E. coli
strain O157:H7 was subcloned into pET21b vector (Novagen). The UraA mutants
were generated using two-step PCR and subcloned, overexpressed and purified in
the same way as wild-type protein. Overexpression of UraA was induced in E. coli
BL21(DE3) by 0.2 mM isopropyl-β-D-thiogalactoside (IPTG) when the cell density
reached an OD600nm of 1.5. To obtain the structure of uracil-bound UraA, uracil (Sigma)
was added at 1 mM when cells were induced. Uracil (1 mM) was included in all the
buffers during protein purification. After growth for 16 h at 22 °C, the cells were
collected, resuspended in buffer containing 25 mM Tris-HCl, pH 8.0, and 150 mM
NaCl, and disrupted using a French press with two passes at 10,000-15,000 p.s.i.
Cell debris was removed by centrifugation at 27,000g for 10 min. The supernatant
containing the membrane was collected and underwent ultracentrifugation at
150,000g for 1 h. The membrane fraction in the pellet was harvested and incubated
with 1.5% (w/v) n-dodecyl-β-D-maltopyranoside (DDM, Anatrace) for 1 h at 4 °C.
After another ultracentrifugation step at 150,000g for 30 min, the supernatant was
collected and loaded on Ni²⁺-nitrilotriacetate affinity resin (Ni-NTA, Qiagen).
Subsequently, the resin was washed three times, each time with 10 ml buffer
containing 25 mM Tris-HCl, pH 8.0, 150 mM NaCl, 30 mM imidazole and 0.02%
DDM. The protein was eluted from the affinity resin with 10 ml wash buffer plus
250 mM imidazole. The proteins were concentrated to about 10 mg ml⁻¹ before
undergoing gel-filtration chromatography (Superdex-200 10/30, GE Healthcare) in
buffer containing 25 mM Tris-HCl, pH 8.0, 150 mM NaCl and the indicated
detergents. The peak fractions were collected for biochemical analyses or crystallization
trials.

Crystallization. Crystals were grown at 18 °C by the hanging-drop vapour-diffusion
method. Crystals in the space group P6₄22 were obtained for protein purified
in the presence of 0.4% β-NG. Uracil (1 mM) was included during protein
expression, purification and crystallization. Crystals appeared overnight in the
buffer containing 25% PEG400, 100 mM MES-NaOH, pH 6.5, and 300 mM
Li₂SO₄, and typically grew to form 50 µm × 50 µm × 100 µm hexagonal rods in
about one month. The crystals diffracted X-rays beyond 2.9 Å at SPring-8 beamline
BL41XU. Mercury derivatives were obtained by soaking the crystals for 24 h in the
mother liquor containing 1 mg ml⁻¹ (C₂H₅HgO)HPO₂. Both native and heavy-
atom-derived crystals were directly flash frozen in a cold nitrogen stream at 100 K.
Data collection and structure determination. All the data were collected at
SPring-8 beamline BL41XU and processed with the HKL2000 package. Further
processing was carried out with programs from the CCP4 suite. Data collection
statistics are summarized in Supplementary Table 1. The initial phases were
obtained from the native P6₄22 crystal and its Hg derivative by single isomorphous
replacement with anomalous scattering (SIRAS) using the program SOLVE.
Figures of merit before and after density modification are 0.244 and 0.749,
respectively. The real-space constraints were applied to the electron density map in DM.
The model was built manually in COOT and the structure was refined with
PHENIX. All structure figures, including the calculation of surface electrostatic
potential, were prepared with PYMOL.

Cell-based uracil-uptake assay. The uraA-deficient E. coli strain Keio Collection
JW2482 (F−, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ−, ΔuraA745::kan,
rph-1, Δ(rhaD-rhaB)568, hsdR514) used in this assay was purchased from the
National BioResource Project (Japan). Wild-type and UraA mutants were
subcloned into the pQLINK vector with a His tag at the C terminus. The membrane
expression levels of UraA variants were monitored by western blot using an
antibody against the His tag. The amount of each UraA protein in the membrane
fraction was estimated by comparing the intensity of the proteins in the membrane
fraction against a serial dilution of purified UraA with known concentrations on
the same western blot, a protocol reported previously.

The cell-based uptake assay was performed following the published protocol with
minor modifications. uraA-deficient E. coli cells transformed with plasmids were
grown in LB medium at 37 °C and induced with 50 µM IPTG when the cell density
reached a D600 nm of 1.5. Cells were collected after induction for 30 min, rinsed and
resuspended to an adjusted D600 nm of 2.0 in AB medium (a modified minimal medium;
see http://openwetware.org/wiki/AB_medium for the detailed recipe). After
incubation in AB medium for 1 h, the cells were taken for the uracil-uptake assay.

For time-course experiments, [5,6-³H]-uracil (4 Ci mmol⁻¹, American
Radiolabelled Chemicals) was added at indicated concentrations. All the reactions
were performed at 25 °C. At the indicated reaction time, an aliquot of cells was
taken for filtration through a 0.45-µm cellulose acetate filter (Sartorius). The filter
membranes were immediately washed with 2 ml ice-cold AB medium, dried and
taken for liquid scintillation counting. Control experiments were performed with
cells transformed with pQLINK empty vector. For each data point, a parallel
control experiment was performed and the control flux was subtracted before data
fitting.

The time-course experiments showed that the accumulation of uracil was roughly
linear within the first 30-60 s. Therefore, to determine the Km and Vmax of uracil
uptake by wild-type UraA, the initial velocities were measured at 30 s. All
experiments, including control experiments, were repeated at least three times, and the data
were fitted to the Michaelis-Menten equation, V = Vmax[uracil]/(Km + [uracil]), in
GRAPHPAD PRISM 5.0 Demo.
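
As an illustration of how Km and Vmax can be estimated from initial velocities, the following is a minimal sketch only. It does not reproduce the nonlinear fit performed in GraphPad Prism; it swaps in the Hanes-Woolf linearization, [S]/v = [S]/Vmax + Km/Vmax, solved by ordinary least squares. The function names, concentrations and kinetic constants are hypothetical and are not taken from the experiments described here.

```python
def michaelis_menten(s, km, vmax):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, velocity):
    """Estimate (Km, Vmax) from the Hanes-Woolf plot of [S]/v against [S].

    The regression line has slope 1/Vmax and intercept Km/Vmax.
    This is a linearized stand-in for a true nonlinear least-squares fit.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


if __name__ == "__main__":
    # Hypothetical, noise-free data (arbitrary units), for illustration only.
    conc = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
    rates = [michaelis_menten(s, km=0.5, vmax=10.0) for s in conc]
    km, vmax = fit_hanes_woolf(conc, rates)
    print(f"Km = {km:.3f}, Vmax = {vmax:.3f}")
```

With real, noisy data the linearized estimate weights points differently from a direct nonlinear fit, so the two approaches need not agree exactly.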

To compare the transport activity of the UraA variants, [5,6-³H]-uracil
(40 Ci mmol⁻¹) was applied at 0.25 µM and each reaction was allowed to proceed for 30 s.
All the UraA variants were expressed and quantified following the same protocol
as for wild-type UraA. The solution behaviour of the purified UraA variants was
examined by size exclusion chromatography, which showed similar profiles to that
of wild-type protein.
SPA-based binding assay. Scintillation proximity assay (SPA) beads were diluted
to 2 mg ml⁻¹ in 150 mM MES-NaOH, pH 6.4, 50 mM NaCl, 20% glycerol, 2 mM
TCEP (Sigma) and 0.05% DDM. About 400 ng of purified, C-terminal His-tagged
UraA protein was incubated with [5,6-³H]-uracil at 4 °C for 2 h. The solution was
then added to 100 µl SPA beads and incubated by vigorous shaking at 4 °C in the
dark for 2 h. The mixture was then loaded into individual wells of 96-well plates.
Scintillation was read in the SPA mode on a Wallac 1450 MicroBeta plate PMT
counter. For the competition assay, the final concentration of [5,6-³H]-uracil was
kept constant at 0.17 µM while the concentration of non-labelled uracil was
increased from 0 to 10 µM. To define the nonspecific binding activity, 400 mM
imidazole was added to the wells to compete with the His tag for bead binding. All
experiments were performed at least three times and data are presented as mean ±
s.d. Nonspecific binding was subtracted from each data point. Data fitting was
performed using GRAPHPAD PRISM 5.0 Demo.
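
The background correction and summary statistics described above can be sketched as follows. This is an illustrative assumption about the bookkeeping, not the authors' analysis: the mean of the imidazole (nonspecific) wells is subtracted from each replicate, and the sample standard deviation is reported. The function name and the counts are hypothetical.

```python
from statistics import mean, stdev


def specific_binding(total_counts, nonspecific_counts):
    """Return (mean, s.d.) of replicate counts after background subtraction.

    Assumption: the mean of the nonspecific (imidazole) wells is subtracted
    from every replicate; s.d. is the sample standard deviation (n - 1).
    """
    background = mean(nonspecific_counts)
    corrected = [count - background for count in total_counts]
    return mean(corrected), stdev(corrected)


if __name__ == "__main__":
    # Hypothetical scintillation counts from three replicate wells.
    avg, sd = specific_binding([110.0, 120.0, 130.0], [10.0, 10.0, 10.0])
    print(f"specific binding = {avg:.1f} ± {sd:.1f} c.p.m.")
```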
Redox freezing and melting in the Earth’s deep
mantle resulting from carbon-iron redox coupling 


Arno Rohrbach¹ & Max W. Schmidt¹


Very low seismic velocity anomalies in the Earth’s mantle may
reflect small amounts of melt present in the peridotite matrix, and
the onset of melting in the Earth’s upper mantle is likely to be
triggered by the presence of small amounts of carbonate. Such
carbonates stem from subducted oceanic lithosphere in part buried
to depths below the 660-kilometre discontinuity and remixed into
the mantle. Here we demonstrate that carbonate-induced melting
may occur in deeply subducted lithosphere at near-adiabatic
temperatures in the Earth’s transition zone and lower mantle. We show
experimentally that these carbonatite melts are unstable when
infiltrating ambient mantle and are reduced to immobile diamond
when recycled at depths greater than ~250 kilometres, where mantle
redox conditions are determined by the presence of an (Fe,Ni) metal
phase. This ‘redox freezing’ process leads to diamond-enriched
mantle domains in which the Fe⁰, resulting from Fe²⁺
disproportionation in perovskites and garnet, is consumed but the Fe³⁺
preserved. When such carbon-enriched mantle heterogeneities become
part of the upwelling mantle, diamond will inevitably react with the
Fe³⁺ leading to true carbonatite redox melting at ~660 and ~250
kilometres depth to form deep-seated melts in the Earth’s mantle.

The Earth’s mantle is a heterogeneous, marble-cake-like composite 
of pristine, depleted, and mostly pyroxenitic or eclogitic recycled 
material, further complicated by heterogeneities left behind in
ancient melt passageways (for example, dunites, cumulative pyroxe- 
nites, or carbon-enriched domains formed through redox freezing). 
These heterogeneities are different from average mantle in terms of 
bulk composition, or formed through differences in H₂O or CO₂
volatile influx. Melting processes in the deep mantle are triggered by 
such heterogeneities, as solidus temperatures for fertile or depleted 
peridotite are significantly higher than adiabatic temperatures at 
depths >120 km. Using high-pressure experiments, we investigate
the formation of C-enriched mantle domains in the vicinity of deeply 
subducted lithosphere (through redox reactions between recycled car- 
bonate and ambient metal-bearing mantle) and the remelting of such 
domains when entrained into the convective mantle. 

Carbonated peridotite systems are central to understanding processes
that involve the recycling of subducted, carbonated lithosphere back
into the mantle. Experiments have shown that at pressures above
~2.5 GPa, carbonated mantle melts at lower temperatures than
carbonate-free mantle. However, these experimental studies have
maintained carbon in its oxidized form. To understand what happens within
the generally reduced deep Earth, it is necessary to include redox
equilibria (controlled by oxygen fugacity, fO₂) between intrinsically oxidized
carbonate minerals or melts and the reduced and metal-bearing deep
mantle.

The redox state of the mantle determines whether carbon is present
in its oxidized and potentially mobile form as carbonate or carbonatite
melt (which lower the mantle solidus by several hundred degrees),
or whether it is present in its reduced and immobile form as graphite or
diamond (which do not affect melting temperatures). In the
uppermost mantle to ~250 km depth, fO₂ is determined by Fe²⁺/Fe³⁺
equilibria in silicate minerals. Studies of natural peridotite xenoliths show
that mantle fO₂ decreases with increasing pressure, such that
carbonates or carbonatites are not stable at depths greater than ~120 km in
subcratonic and asthenospheric mantle. Fe²⁺/Fe³⁺ equilibria in a
predominantly magnesian mantle are inefficient in buffering fO₂
because slight changes in the Fe²⁺/Fe³⁺ ratio have a strong impact
on mantle fO₂ (ref. 16). Oxygen fugacity in the upper mantle is thus
affected by processes such as partial melting, mantle metasomatism,
and recycling of oxidized material by subduction. Small amounts of
admixed oxidized component may raise the fO₂ of a limited mantle
domain such that carbonates become stable and fO₂ is controlled by
equilibria like enstatite + magnesite = olivine + graphite/diamond
(EMOG/EMOD). The onset of carbonatite melting then depends
solely on the solidus of carbonated peridotite.

This situation changes at higher pressures: thermodynamic
calculations and experiments demonstrate that fO₂ decreases with
increasing pressure such that (Fe,Ni)-metal probably becomes stable at
~250 km depth and in all mantle regions below. At metal saturation
depths, fO₂ in the Earth’s mantle becomes narrowly constrained. Given
equilibrium between mantle phases with molar Mg/(Mg+Fe²⁺)
(= XMg) ≈ 0.90 and (Fe,Ni) metal, fO₂ can only vary from values around
the iron-wüstite (IW) equilibrium where the metal would be Ni-rich, to
about 1.5 logarithmic units below IW where the metal would be almost
pure iron. Because these equilibria have a considerable buffer capacity
and the mantle represents an almost infinite reservoir, the mantle is
capable of imposing its ambient fO₂ on any additional redox sensitive
component, such as carbonates, carbonatites or carbon-hydrogen-
oxygen fluids. Redox state differences between such fluids and ambient
peridotite may trigger local hydrous redox melting at shallow mantle
depths when fluid speciation changes from reduced methane-rich to
oxidized water-rich.

At larger mantle depths, small amounts of H₂O can be incorporated
into nominally anhydrous minerals and CO₂ dominates the
lowering of melting temperatures through volatiles. To explore whether
redox equilibria involving recycled carbonates trigger the deepest
melting in the Earth’s mantle, we investigated a carbonated fertile
mantle at pressures (10-23 GPa), temperatures (1,400-1,900 °C) and
redox conditions relevant for the mid-upper to lower mantle using
high-pressure multianvil devices (Methods).

The first set of experiments constrains the solidus of a carbonated
fertile peridotite at 10, 14 and 23 GPa to temperatures of 1,535, 1,600
and 1,675 °C (Fig. 1). Melting temperatures determined previously
vary considerably, mainly because of variations in the bulk alkali
contents and the use of synthetic analogues. Our 10 GPa solidus
temperature corresponds well to the ~1,500 °C reported previously for
10 GPa in a study on natural carbonated peridotite, and thus provides
a direct link between our data set and lower-pressure data. The solidus
from 10 to 23 GPa proceeds parallel to that of alkali-enriched
peridotite at 10-20 GPa (ref. 10), although absolute temperatures obtained
with our composition are 100-150 °C higher (Fig. 1). This difference
results from our considerably lower, peridotite-like, alkali content and
a therefore lower alkali/CO₂ ratio (0.06 this study; 0.18 in ref. 10),
which is potentially a key variable controlling solidus temperatures.
Figure 1 | The solidus of carbonated peridotite. Solid red line, solidus as
bracketed by our experiments. Subsolidus runs, open triangles; runs containing
carbonatite melt, filled inverted triangles. Run pressure uncertainties are smaller
than symbol size; for temperature uncertainties see Methods. Dashed lines, solidi
of natural peridotite; dotted line, solidus of a synthetic CaO-MgO-Al₂O₃-SiO₂
(CMAS) + CO₂ analogue system. Brown long-dashed line, natural
peridotite + 2.5 wt% CO₂ (ref. 3); green short-dashed line, natural peridotite +
5 wt% CO₂ enriched with 0.5 wt% Na₂O + 0.4 wt% K₂O (ref. 10). Solidus
temperatures decrease with increasing bulk alkali/CO₂ ratio, although K appears
to have a much stronger effect than Na (compare refs 3 and 10 with similar
alkali/CO₂). Blue bar, range of potential mantle temperatures at 1 bar extrapolated to
depth (yellow corridor) assuming adiabatic behaviour in a convecting mantle.
The solidus of carbonated peridotite approaches the geotherm at ~10 GPa (ref. 3).
Because the solidus continues within the range of adiabatic temperatures at higher
pressures, we suggest that carbonatitic melts may form at ambient mantle
temperatures over a large pressure interval if carbonate is stable with respect to fO₂.


The steep positive slope of the solidus in a pressure-temperature (P-T) 
plot—55 °C per GPa up to 10 GPa (ref. 3)—changes between 10 and 
15 GPa to a rather weak temperature dependence of 12 °C per GPa at 
>15 GPa. We attribute this change to an increase in activity with 
pressure of mainly Na₂O but also CaO, caused by the continuous
decrease in modal abundance of clinopyroxene. Above 15 GPa,
pyroxene completely dissolves in majorite-garnet solid solution.
Sodium is relatively incompatible in the garnet structure and partitions 
strongly into a carbonatite liquid, thus causing relatively low melting 
temperatures. The solidus of carbonated peridotite approaches the 
mantle geotherm at about 10 GPa and from then on remains close 
to ambient adiabatic mantle temperatures (Fig. 1). Carbonatite melting
in the Earth’s deep mantle therefore does not require anomalously
high temperatures, and carbonatites may be produced over large depth
intervals.
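
To make the solidus brackets easier to reuse, the sketch below interpolates linearly between the three reported points (10, 14 and 23 GPa at 1,535, 1,600 and 1,675 °C) and computes the chord slope of each segment. Linear interpolation is an assumption made here for illustration; the true solidus is curved, so these chord slopes are not the same quantity as the local slopes quoted in the text. Function and variable names are hypothetical.

```python
# Solidus brackets reported in the text: (pressure in GPa, temperature in °C).
SOLIDUS_BRACKETS = [(10.0, 1535.0), (14.0, 1600.0), (23.0, 1675.0)]


def segment_slopes(points):
    """Chord slope (°C per GPa) between each pair of consecutive brackets."""
    return [
        (t2 - t1) / (p2 - p1)
        for (p1, t1), (p2, t2) in zip(points, points[1:])
    ]


def solidus_temperature(pressure_gpa, points=SOLIDUS_BRACKETS):
    """Linearly interpolated solidus temperature (°C) within the bracketed range."""
    if not points[0][0] <= pressure_gpa <= points[-1][0]:
        raise ValueError("pressure outside the experimentally bracketed range")
    for (p1, t1), (p2, t2) in zip(points, points[1:]):
        if p1 <= pressure_gpa <= p2:
            return t1 + (t2 - t1) * (pressure_gpa - p1) / (p2 - p1)


if __name__ == "__main__":
    print(segment_slopes(SOLIDUS_BRACKETS))
    print(solidus_temperature(12.0))
```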

In two further series of experiments, we determined carbon speciation
as a function of fO₂ for slightly subadiabatic mantle temperatures. For
this, we first equilibrated the carbonated peridotite with Fe-FeO,
Ni-NiO, (Ni,Au)-NiO and Re-ReO₂ solid-state metal-metal oxide buffers
at 14 GPa, 1,450 °C and 23 GPa, 1,600 °C. Fe, Ni, Ni-Au and Re metal
were used as capsule materials, and the respective oxide was either
contained in the starting material, or (in the case of ReO₂) added to
the charge. Oxygen fugacities imposed by these buffers range from −1 to
+5 log units relative to IW (Methods). Experiments under oxidizing
conditions (IW +4 to +5) in Re and Au-Ni capsules yield almost pure
magnesite (XMg = 0.96), whereas runs in Fe and Ni capsules reveal that
carbonate is not stable at fO₂ < IW +1.2 at either 14 or 23 GPa (Fig. 2).
Instead, carbonate is reduced to micrometre-sized diamonds, identified
in backscatter electron images and measured using energy-dispersive
spectroscopy. Elevated FeO and NiO contents and fractions in the
minerals provide further evidence for carbonate reduction via


$$\mathrm{MgCO_3 + 2(Fe,Ni)^0 = 3(Fe,Ni,Mg)O + C} \qquad (1)$$


where (Fe,Ni,Mg)O is a compound in olivine, garnet and perovskite or 
is ferropericlase. 
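
The stoichiometry of reaction (1) fixes how much carbonate a given amount of metal can reduce: two moles of metal per mole of MgCO₃. The sketch below works this out under the simplifying assumption, made here for illustration, that the metal is pure Fe (Ni is ignored). Standard atomic masses are used, and the function names are hypothetical. For 1 wt% Fe⁰ it gives roughly 0.75 wt% magnesite and about 1,100 p.p.m. of reduced carbon, in line with the rounded values of 0.8 wt% magnesite and ~1,000 p.p.m. C quoted elsewhere in the text.

```python
# Standard atomic masses (g/mol).
M_FE = 55.845
M_MG = 24.305
M_C = 12.011
M_O = 15.999
M_MGCO3 = M_MG + M_C + 3 * M_O

# Reaction (1): MgCO3 + 2 Fe(0) -> 3 (Fe,Mg)O + C, so 2 mol metal per mol carbonate.
METAL_PER_CARBONATE = 2.0


def magnesite_equivalent_wt_pct(fe_metal_wt_pct):
    """wt% magnesite that can be reduced by a given wt% of Fe metal."""
    mol_carbonate = fe_metal_wt_pct / M_FE / METAL_PER_CARBONATE
    return mol_carbonate * M_MGCO3


def carbon_equivalent_ppm(fe_metal_wt_pct):
    """p.p.m. (by weight) of elemental carbon produced by that reduction."""
    mol_carbon = fe_metal_wt_pct / M_FE / METAL_PER_CARBONATE
    return mol_carbon * M_C * 1.0e4  # wt% -> p.p.m.


if __name__ == "__main__":
    print(round(magnesite_equivalent_wt_pct(1.0), 2))
    print(round(carbon_equivalent_ppm(1.0)))
```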




[Figure 2 plot. Axes: temperature (°C) and pressure (GPa). Labelled fields and buffers: magnesite, magnesite + diamond, diamond, Re-ReO₂, Ni-NiO, EMOD.]


Figure 2 | Carbon speciation in natural mantle as a function of pressure,
temperature and fO₂. Plotted are calculated log fO₂ values relative to the IW
reference (Methods) for magnesite-bearing runs (red symbols), diamond-
bearing runs (blue symbols) and runs containing diamond + magnesite using
Ir as redox sensor (black symbols). Coloured areas mark the stability fields of
magnesite (yellow), magnesite + diamond (white) and diamond (light grey).
Run durations were at least 24 h to ensure redox equilibrium between the
phases (Supplementary Table 1). The stability field of magnesite remains at ≥2
log units above IW up to 23 GPa; no strong pressure dependence is evident
from our data. At the low fO₂ conditions estimated for the Earth’s mantle which
lead to metal saturation at pressures higher than ~8 GPa (corresponding to
~250 km depths; dark grey area), carbonates and carbonatites are unstable and
will be reduced to diamond according to reaction (1). Relative positions of IW
and Ni-NiO buffers calculated after ref. 33; for relative position of Re-ReO₂, see
Methods. The EMOD buffer dominating in the uppermost mantle (equation
given in the text) is limited to ~14 GPa by the stability of olivine and pyroxene.
Run pressure uncertainties are within the size of the symbols, as are propagated
2σ errors for fO₂ unless error bars are reported. Errors include analytical
uncertainties and 2σ standard deviations from averages for runs where fO₂ was
calculated for multiple phases separately, but do not include uncertainties of the
activity models which are difficult to quantify. Because of this, we suggest that
an additional uncertainty for all fO₂ values reported is ±0.3 log units.


Additionally, we monitored fO₂ in experiments saturated in both
diamond and carbonate at 10-23 GPa by adding 5 wt% iridium metal
as a redox sensor. The Fe content in Ir, together with the FeO content
in silicates or oxides, allows the calculation of fO₂ conditions during the
experiment (Methods). The carbonated peridotite was saturated with
elemental carbon through the use of graphite capsules that transformed
to diamond or, for noble metal capsules, through addition of 10 wt% C.
We also added 3 wt% Fe⁰ to enhance the mass reacted through
equilibrium (1). These redox sensing experiments yield fO₂ values slightly
lower than calculated for the EMOD equilibrium at 10 and 14 GPa and
then remain rather constant at 2.3-2.7 log units above IW up to 23 GPa.
This is consistent with the fO₂ corridor between IW +1.2 and +4, as
defined by the externally buffered experiments, in which the
coexistence of diamond + magnesite would be possible (Fig. 2).

Our results imply that at an ambient, fertile mantle fO₂ around IW −1.5
(ref. 14), carbonate that is remixed into the mantle at >250 km
depth is unstable and will be reduced to diamond. Typical carbon
concentrations of 20-250 p.p.m. C for sub-ridge mantle (ref. 25 and
references therein) would inevitably dissolve in the metal phase
present at these depths or form discrete iron carbides, like Fe₃C and Fe₇C₃,
depending on Fe-C ratio and P-T conditions. Thus, carbonate-
related melting is unlikely to occur in Earth’s lower mantle, the
transition zone and the lowermost upper mantle as long as sub-ridge carbon
concentrations prevail (Fig. 2). Although the mantle ceases to be metal
saturated in shallower parts of the upper mantle (<250 km, ref. 6), the
average fO₂ up to 100-150 km depth presumably remains too low to
sustain carbonates or carbonatites as equilibrium phases at sub-ridge
carbon concentrations.
However, the addition of subducted carbonate to average mantle
fundamentally changes this behaviour. Relative buffer capacity
changes in the Earth’s mantle induce immobilization of carbonatite
melts through ‘redox freezing’, that is, reduction of carbonatites to
diamond, as well as remobilization of carbon through redox melting.
Redox melting transforms diamond to carbonatite melts, which
potentially control the onset of ultra-deep melting. Starting from a
subducting, locally carbonated, relatively oxidized mafic to ultramafic
lithosphere, our experiments demonstrate that carbonatite melts will
be generated in such lithosphere on thermal relaxation (Fig. 1). This
may occur when the lithosphere deflects into the transition zone above
the 660-km discontinuity or when stagnating in the lower mantle. On a
local scale, oxidized carbonatite melt migrating into the mantle will
consume metal to first form iron carbide in an intermittent stage, and
then further oxidize the Fe and Ni contained in the carbide to leave a
mantle domain that contains all iron as Fe²⁺ and Fe³⁺ in silicates and
ferropericlase and all carbon as diamond. Owing to its low viscosity
and high wetting properties, any excess carbonatite not consumed
by redox reactions would percolate upwards along grain boundaries
and exhaust further (Fe,Ni)-metal and carbide until complete redox
freezing (that is, immobilization due to reduction of CO₂ to C) is
achieved. This presumably very efficient process will eventually
exhaust all buffering metal and carbide through precipitation of
diamond, and result in a metal-free mantle domain where diamond
coexists with Fe³⁺-bearing garnet or perovskite (Fig. 3).

At the boundary of such domains, where the supply of carbonatite
melt does not exceed the redox capacity of Fe,Ni-metal, an iron carbide
rim is expected to form. The redox capacity of Fe³⁺ in such mantle
domains would be exactly equivalent or slightly superior (due to the
Fe³⁺ present before disproportionation of Fe²⁺) to that necessary to
re-convert all diamond to CO₂. Similarly, the maximum increase in C
content of mantle domains metasomatized by carbonatites derived
from the subducting lithosphere is restricted to ~1,000 p.p.m.,
equivalent to the amount of carbonatite that may be immobilized by the
maximum Fe-metal content of 1 wt% expected in the lower mantle.


Figure 3 | Carbonatitic redox freezing and redox melting caused by redox
capacity changes in Earth’s mantle. Main panel, cartoon illustrating a possible
sequence of redox freezing and redox melting events driven by oxidation state
contrasts between subducted lithosphere and ambient asthenospheric mantle.
Right, potential mantle fO₂ (red line) and redox buffer capacity (blue line) as a
function of depth. fO₂ decreases in the subcratonic upper mantle to reach
~IW and metal saturation at ~250 km depth. Change in redox control from
Fe²⁺/Fe³⁺ in the upper mantle to Fe⁰-FeO provides a considerable gain in fO₂
buffer capacity. At >250 km depth, the mantle and all redox sensitive
components are bound to fO₂ levels <IW unless all metal is exhausted.
Majoritic garnet in equilibrium with Fe⁰ incorporates more Fe³⁺ with
increasing pressure; adequate charge balance results from reduction of FeO to
Fe⁰ which increases the Fe⁰ metal abundance. An estimate yields 0.5 wt% metal
at the base of the transition zone, assuming all Ni is present as Ni⁰ (1,800 p.p.m.)
As the metal content is expected to decrease with decreasing pressure
(Fig. 3), the corresponding maximum C content of metasomatized 
domains would also be lower for domains formed at shallower depth. 

The inverse process, redox melting, would occur when such mantle
heterogeneities are entrained by upwelling mantle and cross the 660-km
discontinuity and transition zone. Destabilization of Fe³⁺-rich perovskite
without an adequate amount of Fe⁰ for comproportionation (the
opposite process to disproportionation) inevitably leads to a sudden
increase in Fe³⁺ activity at the 660-km discontinuity and thus to the
re-oxidation of diamond to CO₂. This redox reaction would directly
result in carbonatite melts within upwelling, previously melt-free but
diamond-bearing mantle. Carbonatitic redox melting might also
explain the virtual absence of pyrolitic mineral inclusions in diamonds
originating from sublithospheric depths (>200-300 km). Pyrolitic
pristine mantle itself does not contain sufficient C to form diamonds,
and thus diamonds are expected to form dominantly through the
above redox freezing process wherever carbonatites percolate. The
inverse process (that is, redox melting, which destroys most of these
diamonds through oxidation) then leads to the generation of melts
that carry remnants of such deep mantle domains to the Earth’s
surface. Average mantle is expected to contain ~1 wt% Fe⁰ formed from
Fe²⁺ disproportionation in the lower mantle; such metal fractions are
stable and would not segregate to Earth’s core. For reaction (1), the
redox capacity of 1 wt% Fe⁰ is equivalent to 0.8 wt% magnesite. As
argued above, the properties of carbonatite melt result in a self-
regulating mechanism, where infiltrating carbonatite melt oxidizes
all Fe⁰, leaving behind a diamond-bearing mantle domain with exactly
the same redox capacity (that is, that of 0.8 wt% magnesite). In the
reverse process, it can thus be expected that about 1 wt% of carbonatite
melt forms in such upwelling mantle domains. Although upwelling
occurs at speeds comparable to plate tectonic movements (that is,
1-10 cm yr⁻¹), 1% low-viscosity melt in the mantle matrix rises with
speeds of at least 10-100 m yr⁻¹ (ref. 31). Consequently, carbonatite
flow will tend to escape from the upwelling mantle matrix but will
suffer redox freezing as long as progressing carbonatite melts encounter


[Figure 3, right-hand panels. Axes: fO₂ relative to IW; metal (wt%). Labels: buffer capacity; Fe³⁺/Fe²⁺ control.]


plus ~3,200 p.p.m. Fe⁰ calculated from mass balance (mantle: 8 wt% FeO(tot);
40 vol% majoritic garnet: 5.5 wt% FeO(tot), Fe³⁺/ΣFe = 0.33 (ref. 6); 60 vol%
ringwoodite: 9.7 wt% FeO(tot), Fe³⁺/ΣFe = 0.02 (ref. 14)). Transition to the
lower mantle is accompanied by a gain in buffer capacity because MgSi-
perovskite incorporates more Fe³⁺ at metal saturation fO₂ than majoritic
garnet, resulting in ~1.0 wt% metal. Remixing of subducted carbonated
lithosphere into the mantle at these depths leads to redox freezing and
diamond-enriched mantle domains. Domains incorporated into upwelling
mantle will experience redox melting, most prominently at levels where the
capacity of the mantle phases to incorporate Fe³⁺ changes drastically, that is, at
~660 km and ~250 km depth. Melts produced from C-enriched domains
cannot escape through the surrounding metal-bearing average mantle, but will
freeze until redox control shifts from metal saturation back to bulk Fe²⁺/Fe³⁺
control at ~250 km depth.
metal-bearing mantle. It is only at depths <250 km, when garnet
becomes less majoritic, Fe³⁺ activity increases, and the mantle ceases
to be metal-saturated, that this scenario changes. At this depth, fO₂
buffering in the mantle shifts back from being controlled by
Fe⁰/Fe²⁺ to being controlled by Fe²⁺/Fe³⁺, which leads to a dramatic
decline in buffering capacity. Such mantle has little potential to hinder
carbonatite melts from percolating upwards.
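
The metal estimate given in the Figure 3 caption (about 3,200 p.p.m. Fe⁰ plus 1,800 p.p.m. Ni⁰, or roughly 0.5 wt% metal at the base of the transition zone) can be checked with a short mass balance. The sketch below is an illustration under two assumptions made here: the volume fractions of garnet and ringwoodite are treated as mass fractions, and all Fe³⁺ is taken to form by the disproportionation 3Fe²⁺ → 2Fe³⁺ + Fe⁰, so that one Fe⁰ forms for every two Fe³⁺. Variable and function names are hypothetical.

```python
M_FE = 55.845
M_FEO = 55.845 + 15.999

# (phase, fraction of the rock, wt% FeO(tot) in the phase, Fe3+/sum(Fe)),
# values as quoted in the Figure 3 caption.
PHASES = [
    ("majoritic garnet", 0.40, 5.5, 0.33),
    ("ringwoodite", 0.60, 9.7, 0.02),
]
NI_METAL_PPM = 1800.0  # all Ni assumed present as Ni(0), as in the caption


def bulk_feo_wt_pct(phases):
    """Bulk FeO(tot) of the rock implied by the phase proportions."""
    return sum(frac * feo for _, frac, feo, _ in phases)


def fe0_ppm(phases):
    """p.p.m. Fe(0) implied by the Fe3+ held in the silicates."""
    ferric_as_feo = sum(frac * feo * ratio for _, frac, feo, ratio in phases)
    ferric_fe_wt_pct = ferric_as_feo * M_FE / M_FEO
    # 3 Fe2+ -> 2 Fe3+ + Fe0: one Fe0 per two Fe3+.
    return ferric_fe_wt_pct / 2.0 * 1.0e4


if __name__ == "__main__":
    total_metal_wt_pct = (fe0_ppm(PHASES) + NI_METAL_PPM) / 1.0e4
    print(round(bulk_feo_wt_pct(PHASES), 2))
    print(round(fe0_ppm(PHASES)))
    print(round(total_metal_wt_pct, 2))
```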

We expect that redox equilibria control the formation of deep
carbonatite melts, and that at ~250 km depth a continuous flow of
carbonatite melt originating from mantle heterogeneities may form in
upwelling mantle, which is probably reflected in seismic low velocity
anomalies observed at these depths. If melt quantities are too low and
evade direct seismological observation, their presence may be
confirmed by enhanced electrical conductivity or seismic anisotropies
caused by matrix reorganization.


METHODS SUMMARY 

Starting composition. The starting composition for the experiments represents a
fertile peridotite enriched in 5 wt% CO₂. NiO in this composition was set to 1 wt%,
to ensure that the NiO content of the phases was high enough for microprobe
analysis and fO₂ calculations. Experiments using Ir as a redox sensor were without
Ni to ensure that the metal is a binary Fe-Ir alloy.

Oxygen fugacity calculations. We calculate fO₂ using Fe-FeO and Ni-NiO redox
equilibria. fO₂ values are reported relative to iron-wüstite (IW), nickel-NiO
(NNO) and rhenium-ReO₂ (RRO), using activity-composition relationships for
(Fe,Ni) alloys and (Au,Ni) alloys (Methods). fO₂ values of ferropericlase (fp)-
bearing experiments are calculated from


$$\mathrm{2M + O_2 = 2MO} \qquad (2)$$


with M = Fe or Ni. fo, relative to the respective metal-metal oxide (MMO) equi- 
librium is given by 


$$\Delta\log f_{\mathrm{O_2}}\,[\mathrm{MMO}] = 2\log a_{\mathrm{MO}} - 2\log a_{\mathrm{M}}^{\mathrm{metal}} \qquad (3)$$


where activity a is defined as molar fraction X times activity coefficient γ. fO₂
conditions of runs containing olivine, wadsleyite or ringwoodite were calculated
from Fe activity in the olivine polymorph and the coexisting metal phase using
binary symmetric solution models for activity corrections.
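
Equation (3) follows from the equilibrium constant of reaction (2): relative to the pure metal-metal oxide buffer, where both activities are unity, the offset in log fO₂ is twice the difference of the log activities of oxide and metal. A minimal sketch of that evaluation is given below. The function names and the example mole fractions are hypothetical, and the activity coefficients default to 1 (ideal mixing) purely for illustration; they are not the activity models used in this study.

```python
import math


def activity(mole_fraction, gamma=1.0):
    """Activity a = X * gamma (gamma = 1 corresponds to ideal mixing)."""
    return mole_fraction * gamma


def delta_log_fo2(a_oxide, a_metal):
    """Offset of log fO2 from the metal-metal oxide buffer, equation (3)."""
    return 2.0 * math.log10(a_oxide) - 2.0 * math.log10(a_metal)


if __name__ == "__main__":
    # Hypothetical example: dilute FeO in ferropericlase, Fe-rich alloy.
    a_feo = activity(0.10)
    a_fe = activity(0.90)
    print(round(delta_log_fo2(a_feo, a_fe), 2))
```

Lower oxide activity or higher metal activity both push the result to more negative values, that is, to more reducing conditions relative to the buffer.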

Iridium redox sensing. Ir metal was used as a redox sensor to determine fO₂ in
runs containing elemental carbon + magnesite. Margules interaction parameters
were obtained by least-squares fitting of a binary asymmetric regular solution
model to X-ray data for face-centred cubic (f.c.c.) Ir-Fe alloy (Methods).
Activity coefficients γ, dependent on P, T and X for Fe in f.c.c. Ir-Fe alloy, are
given by


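$$RT\ln\gamma_{\mathrm{Fe}} = 2X_{\mathrm{Fe}}X_{\mathrm{Ir}}W^{G}_{\mathrm{Ir\text{-}Fe}} + X_{\mathrm{Ir}}^{2}W^{G}_{\mathrm{Fe\text{-}Ir}} - 2G^{\mathrm{xs}} \qquad (4)$$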


with $G^{\mathrm{xs}}$ being the excess Gibbs free energy of mixing, and $W^{G}_{\mathrm{Ir\text{-}Fe}}$ and $W^{G}_{\mathrm{Fe\text{-}Ir}}$ being
the Margules parameters. Correcting the activity model for P and T is described in
Methods and shown in Supplementary Fig. 1.
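
Equation (4) can be evaluated numerically as sketched below. The expression for the excess Gibbs energy is not given in this summary, so the sketch assumes the usual two-parameter asymmetric regular solution form, G_xs = X_Fe·X_Ir·(W_IrFe·X_Fe + W_FeIr·X_Ir); with that assumption equation (4) reduces to RT ln γ_Fe = X_Ir²[W_FeIr + 2X_Fe(W_IrFe − W_FeIr)]. The parameter values in the example are hypothetical placeholders, not the fitted Margules parameters of this study.

```python
import math

R_GAS = 8.314462618  # J mol^-1 K^-1


def excess_gibbs(x_fe, w_ir_fe, w_fe_ir):
    """Assumed asymmetric regular solution excess Gibbs energy (J/mol)."""
    x_ir = 1.0 - x_fe
    return x_fe * x_ir * (w_ir_fe * x_fe + w_fe_ir * x_ir)


def rt_ln_gamma_fe(x_fe, w_ir_fe, w_fe_ir):
    """Right-hand side of equation (4), in J/mol."""
    x_ir = 1.0 - x_fe
    g_xs = excess_gibbs(x_fe, w_ir_fe, w_fe_ir)
    return 2.0 * x_fe * x_ir * w_ir_fe + x_ir ** 2 * w_fe_ir - 2.0 * g_xs


def gamma_fe(x_fe, w_ir_fe, w_fe_ir, temperature_k):
    """Activity coefficient of Fe in the binary Ir-Fe alloy."""
    return math.exp(rt_ln_gamma_fe(x_fe, w_ir_fe, w_fe_ir) / (R_GAS * temperature_k))


if __name__ == "__main__":
    # Hypothetical Margules parameters (J/mol) and temperature (K).
    print(gamma_fe(0.1, w_ir_fe=-40000.0, w_fe_ir=-60000.0, temperature_k=1873.15))
```

Two limiting cases serve as a check on the algebra: γ_Fe tends to 1 as X_Fe approaches 1, and RT ln γ_Fe tends to W_FeIr at infinite dilution of Fe.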


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Starting material. The starting compositions for multianvil experiments used in
this study represent a fertile peridotite composition enriched in 5 wt% CO₂
(Supplementary Table 2). Starting composition pyr5 was used to determine the solidus
temperature of carbonated peridotite and for runs where the oxygen fugacity of the
charge was controlled by the Fe, Ni, and (Au,Ni) capsule materials. Both NiO and
CoO contents of pyr5 were set to 1 wt%, to ensure that NiO contents of the phases
were high enough for microprobe analysis and fO₂ calculation from these data.
Experiments using Ir metal as redox sensor were performed with the simplified pyrx5
starting material to ensure that the metal phase in the charge is a binary Fe-Ir alloy.

Starting composition pyr5 was prepared from reagent grade oxides SiO₂, TiO₂,
Al₂O₃, Cr₂O₃, MgO, fired overnight at 1,000 °C before mixing and grinding under
acetone in an agate mortar. Part of the CaO inventory was added to the oxide mix as
reagent grade CaCO₃ (dried at 220 °C for 5 h), mixed and decarbonized by stepwise
heating to 1,000 °C. Afterwards, iron was added as synthetic fayalite, NiO and CoO
as oxides, all dried at 220 °C for 5 h. CO₂ was added as a mixture of reagent grade
CaCO₃, MgCO₃ and Na₂CO₃ dried at 220 °C for 5 h in 61/34/5 wt ratio. All
components were mixed and ground again thoroughly under acetone in an agate mortar.
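
For orientation, the CO₂ carried by the 61/34/5 (by weight) carbonate mixture can be worked out from molar masses, as sketched below. The calculation is an illustration added here and is not part of the reported procedure; it assumes pure, anhydrous reagents and uses standard atomic masses. On these assumptions the mixture is about 47 wt% CO₂, so roughly 10.7 wt% of mixture would supply 5 wt% CO₂.

```python
ATOMIC_MASS = {"Ca": 40.078, "Mg": 24.305, "Na": 22.990, "C": 12.011, "O": 15.999}
M_CO2 = ATOMIC_MASS["C"] + 2 * ATOMIC_MASS["O"]
M_CO3 = ATOMIC_MASS["C"] + 3 * ATOMIC_MASS["O"]

# Carbonate: (molar mass in g/mol, weight parts in the mixture).
# Each formula unit releases one CO2 on decarbonation.
CARBONATE_MIX = {
    "CaCO3": (ATOMIC_MASS["Ca"] + M_CO3, 61.0),
    "MgCO3": (ATOMIC_MASS["Mg"] + M_CO3, 34.0),
    "Na2CO3": (2 * ATOMIC_MASS["Na"] + M_CO3, 5.0),
}


def co2_mass_fraction(mix):
    """Mass fraction of CO2 in the carbonate mixture."""
    total_parts = sum(parts for _, parts in mix.values())
    co2_parts = sum(parts * M_CO2 / molar_mass for molar_mass, parts in mix.values())
    return co2_parts / total_parts


def mixture_needed_wt_pct(target_co2_wt_pct, mix=CARBONATE_MIX):
    """wt% of carbonate mixture in the bulk needed for a target wt% CO2."""
    return target_co2_wt_pct / co2_mass_fraction(mix)


if __name__ == "__main__":
    print(round(co2_mass_fraction(CARBONATE_MIX), 3))
    print(round(mixture_needed_wt_pct(5.0), 1))
```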

Starting composition pyrx5 was prepared from reagent grade SiO₂, Al₂O₃, MgO
(all fired at 1,000 °C for 5 h), synthetic fayalite, synthetic wollastonite, natural
magnesite (kindly provided by P. Ulmer), and reagent grade Na₂CO₃ (all dried
at 220 °C for 5 h).

To ensure nominally anhydrous conditions, the starting materials were stored
in a desiccator together with silica gel; synthetic rock powders were dried for 2 h
at 220 °C before being pressed into capsules, and each filled capsule was dried again for
1 h at 220 °C before final welding.

Multianvil experiments. High-pressure experiments (Supplementary Table 1)
were performed in 600 t and 1,000 t Walker-type multianvil devices. We used
32-mm tungsten carbide cubes as second-stage anvils and natural pyrophyllite
gaskets as pressure-transmitting media. We used an 18/11 assembly for runs at
10 GPa (18 mm octahedron edge length, 11 mm truncated edge length), a 14/8
assembly for 14 GPa and a 10/3.5 assembly for 19 and 23 GPa. All assemblies
consist of a Cr2O3-doped MgO octahedron with LaCrO3 heater (stepped in case
of 18/11 and 14/8), ZrO2 thermal insulation, MgO spacers, and Mo metal plugs to
ensure electrical contact between assembly and WC cubes. Pressure was computer
controlled during the entire run; temperature was measured with B- or C-type
thermocouples and controlled with a Eurotherm controller to about ±5 °C.
Experiments were quenched (by turning off the power supply) at a rate of
~800 °C per second. The total temperature precision, including temperature gradients
within the capsule, might be ±15 °C in 18/11 and 14/8 experiments.
Temperature gradients within the 10/3.5 assembly might be 30-50 °C over the
capsule length of ~1.2 mm. Details of pressure calibration, design, and temperature
gradients of the 10/3.5 assembly are given in ref. 35.

Encapsulation. Runs to determine the carbonated peridotite solidus were performed
in welded Au80Pd20 capsules. No attempt was made to control fO2 during these runs.
The use of conventional Pt-C double capsules appeared to be problematic, as the space
is very limited and recovering the sample from the graphite capsule, which transformed
to diamond during the runs, was rather challenging. Mass balance calculations and
measured Fe contents in the capsule material show that iron loss from the charge to the
Au80Pd20 capsule is negligible (see also X_Mg values in Supplementary Table 2).

Runs using Ir metal as redox sensor were performed in welded Au80Pd20 capsules.
We added ~10 wt% of graphite to the charges to obtain fO2 buffering
through coexisting C⁰ and carbonate in a peridotite matrix. We added ~5 wt% fine-grained
Ir metal and 3-5 wt% of metallic Fe to ensure redox equilibrium via
carbonate reduction according to reaction (1).

Slightly lower X_Mg values of all phases in fO2-monitoring runs (Supplementary Table 2)
are indicative of reaction (1) and therefore of the attainment of redox equilibrium
in these runs. Both diamond and magnesite were present in experimental
charges in direct contact with each other, as verified by backscattered electron imaging,
EDX and EPMA analysis. This encapsulation method yields results that agree
within error with those from experiments in C capsules surrounded by Re metal foil
(compare fO2 of runs 39 and 47; Supplementary Table 5), but sample recovery and
polishing were facilitated considerably, especially for runs at 23 GPa.

Fe⁰ and Ni⁰ capsules were not welded but cold sealed at run pressure with a
conical cap. We added ~3 wt% of the respective metal to the charge to facilitate 
attainment of equilibrium. 

(Au,Ni) alloy capsules were prepared as a welded Au outer capsule and a Ni foil inner
capsule, which formed a homogeneous (Au,Ni) alloy at run pressures and temperatures.
In runs 22 and 29b we added ~10 wt% NiO in order to raise the NiO content of
the phases and, accordingly, the relative fO2. These NiO-enriched compositions crystallized
a NiO-rich periclase instead of ferropericlase at 23 GPa (run 29b) and an olivine
enriched in the Ni2SiO4 component at 14 GPa (run 22, see Supplementary Table 2).

Analytical methods. Recovered sample capsules (in case of 10/3.5 runs, complete
octahedra) were mounted in an epoxy resin, ground down with sandpaper and
polished with polycrystalline diamond paste. Runs to determine the solidus were
polished under dry conditions using synthetic polishing paper.

Runs were analysed with a Jeol JXA8200 microprobe using minerals and synthetic
oxides as standards. Results are given in Supplementary Table 2. Silicates and
oxides were measured at 15 kV and 20 nA with a focused beam; counting times were
20 s on the peak and 10 s on the background. Perovskites and carbonates were
measured with an electron beam of 5 µm diameter at 10 nA to minimize radiation
damage. Note, however, that CaSi-perovskite was still unstable under the beam,
which results in low totals. Additionally, CaSi-perovskite often shows a blurred
exsolution texture that is difficult to analyse without obtaining mixed signals. The
majority of the experiments, especially under subsolidus conditions (despite run
durations of up to 4 days), showed average grain sizes between 3 and 10 µm and did not
allow a broad beam. In experiments AR13 and AR18, some phases were too small to
be measured by electron microprobe and were identified using backscattered electron
images and EDX (Jeol JSM 6390LA scanning electron microscope).

Metals were measured with a focused beam at 15 kV and 20 nA; counting times
were 30 s on the peak and 10 s on the background. We used Fe, Ir, Ni and Co metal
as standards for quantification to reduce matrix effects. To minimize the excitation
volume of the sample, and so be able to measure small Ir-Fe alloy grains, we measured
Ir at 15 kV and 20 nA using its low-excitation-energy Mα peak.

Calculation of oxygen fugacities. We calculated fO2 of the experimental charges
using Fe-FeO and Ni-NiO redox equilibria. fO2 values are reported relative to the iron-wüstite
(IW) and nickel-NiO (NNO) buffers. The relative positions of IW and NNO at
P and T were calculated after ref. 33; the relative position of the Re-ReO2 (RRO) buffer
was estimated from ref. 36. The activity-composition relationships for (Fe,Ni) alloys
were taken from ref. 37; activity data for (Au,Ni) alloys were adopted from ref. 38.

Periclase-bearing experiments. Oxygen fugacities of ferropericlase (fp)-bearing
experiments (19 and 23 GPa) were calculated using equilibria of the form


2M + O_2 = 2MO   (5)


with M = Fe or Ni. fO2 relative to the respective metal-metal oxide (MMO) equilibrium
reads:

\Delta \log f_{O_2} [MMO] = 2 \log a_{MO}^{fp} - 2 \log a_{M}^{metal}   (6)


where activity (a) is defined as molar fraction (X) times activity coefficient (γ). The
molar fractions for metals and phases, calculated activity coefficients and calculated
fO2 values of the experiments are summarized in Supplementary Tables 3-5.
a_{FeO}^{fp} was determined using a binary regular solution model:

RT \ln \gamma_{FeO}^{fp} = (11,000 + 0.011P)(1 - X_{FeO}^{fp})^{2}   (7)


for FeO, where P is in bar, T in K, and R is the gas constant. We note that the ferric
iron content of ferropericlase was not measured and microprobe FeO(total) data were
used to calculate X_{FeO}^{fp}. The Fe3+/ΣFe for ferropericlase in systems with X_Mg = 0.9,
however, is generally <0.05 (refs 40-42), so calculated fO2 values might
be overestimated by a maximum of 0.1 log units.
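
Equations (6) and (7) can be combined into a short calculation. The sketch below is illustrative only: it assumes the interaction term in equation (7) is in J mol^-1 (the text gives P in bar and T in K but does not state the energy unit), it takes the oxide activity as X times γ, and the composition, pressure, temperature and alloy activity in the usage lines are made-up inputs, not values from the Supplementary Tables.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def gamma_feo_fp(x_feo, p_bar, t_kelvin):
    """Activity coefficient of FeO in ferropericlase, after equation (7)."""
    w = 11000.0 + 0.011 * p_bar  # interaction parameter, assumed J mol^-1
    return math.exp(w * (1.0 - x_feo) ** 2 / (R * t_kelvin))


def delta_log_fo2_mmo(a_oxide, a_metal):
    """fO2 relative to the metal-metal oxide equilibrium, after equation (6)."""
    return 2.0 * math.log10(a_oxide) - 2.0 * math.log10(a_metal)


# Hypothetical inputs: X_FeO = 0.10 in ferropericlase at 23 GPa (230,000 bar)
# and 1,873 K, coexisting with an alloy in which the Fe activity is 0.5.
x_feo = 0.10
gamma = gamma_feo_fp(x_feo, p_bar=230000.0, t_kelvin=1873.0)
dlog_iw = delta_log_fo2_mmo(x_feo * gamma, a_metal=0.5)
print(round(gamma, 2), round(dlog_iw, 2))
```

Because the interaction term is positive, γ exceeds 1 and the computed fO2 is higher than it would be for ideal mixing at the same composition.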

a_{NiO}^{fp} was calculated using a ternary regular solution model for NiO:

RT \ln \gamma_{NiO}^{fp} = W_{Ni-Mg}(X_{Mg}^{fp})^{2} + W_{Ni-Fe}(X_{Fe}^{fp})^{2}
+ (W_{Ni-Mg} + W_{Ni-Fe} - W_{Mg-Fe}) X_{Mg}^{fp} X_{Fe}^{fp}   (8)


where T is in K, and R is the gas constant.
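
The ternary expression in equation (8) is easy to mis-transcribe, so a small sketch may help. It assumes the standard symmetric ternary regular-solution form, with the cross term multiplied by X_Mg X_Fe; the interaction parameters passed in below are placeholders chosen for illustration, not the values used in the study.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def gamma_nio_fp(x_mg, x_fe, w_ni_mg, w_ni_fe, w_mg_fe, t_kelvin):
    """Activity coefficient of NiO in (Mg,Fe,Ni)O, symmetric ternary model."""
    rt_ln_gamma = (w_ni_mg * x_mg ** 2
                   + w_ni_fe * x_fe ** 2
                   + (w_ni_mg + w_ni_fe - w_mg_fe) * x_mg * x_fe)
    return math.exp(rt_ln_gamma / (R * t_kelvin))


# Placeholder parameters (assumed J mol^-1), for illustration only.
ideal = gamma_nio_fp(0.85, 0.10, 0.0, 0.0, 0.0, 1873.0)
binary = gamma_nio_fp(0.90, 0.0, 5000.0, 0.0, 0.0, 1873.0)
print(ideal, round(binary, 3))
```

With all interaction parameters set to zero the function returns 1 (ideal mixing), and with X_Fe = 0 it reduces to a single binary term, W_{Ni-Mg} times X_Mg squared.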

Olivine-, wadsleyite- or ringwoodite-bearing experiments. fO2 conditions relative
to the quartz-fayalite-iron (QFI) equilibrium of runs containing olivine,
wadsleyite or ringwoodite (10, 14 and 19 GPa) were calculated using the following
expressions:

2Fe^{0} + O_2 + SiO_2 = Fe_2SiO_4   (9)

\Delta \log f_{O_2} [QFI] = 2 \log a_{Fe}^{ol/wad/rw} - 2 \log a_{Fe}^{metal} - \log a_{SiO_2}   (10)

Activity coefficients were calculated using a binary symmetric solution model
where:

RT \ln \gamma_{Fe}^{ol} = W_{FeMg}^{ol}(1 - X_{Fe}^{ol})^{2}   (11)


Margules parameters, including W_V to correct the olivine interaction parameter
for pressure, were taken from ref. 45 (Supplementary Table 6).
NiO-bearing olivines show a slight negative deviation from ideal mixing, even at the
high temperatures relevant for our experiments. Owing to this we set \gamma_{Ni}^{ol} = 0.9,
but the effect on calculated fO2 is minor, that is, in the range ±0.05 log units.

Silica activities (a_{SiO_2}) in experiments containing olivine and clinoenstatite were
calculated from the equilibrium:

Mg_2SiO_4 + SiO_2 = Mg_2Si_2O_6   (12)

where

\log a_{SiO_2} = \log a_{en}^{px} - \log a_{fo}^{ol} - \log K   (13)

Enstatite activity in pyroxene was calculated after ref. 48 and forsterite activity in
olivine after ref. 45 as described above for fayalite. log K of reaction (12) was calculated
using thermodynamic data. fO2 values relative to IW are obtained by adding the
difference in fO2 between QFI and IW equilibria at run P and T to Δlog fO2 [QFI].
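
Equations (10), (11) and (13) chain together as sketched below. This is a minimal illustration that assumes equation (10) has the form 2 log a(Fe in silicate) - 2 log a(Fe in metal) - log a(SiO2) and that activities are X times γ; every numerical input, including log K and the interaction parameter, is invented for the example.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def gamma_symmetric(x, w, t_kelvin):
    """Binary symmetric solution model, after equation (11)."""
    return math.exp(w * (1.0 - x) ** 2 / (R * t_kelvin))


def log_a_sio2(a_en, a_fo, log_k):
    """Silica activity from the olivine-pyroxene equilibrium, after equation (13)."""
    return math.log10(a_en) - math.log10(a_fo) - log_k


def delta_log_fo2_qfi(a_fe_silicate, a_fe_metal, log_a_silica):
    """fO2 relative to the QFI equilibrium, after equation (10)."""
    return (2.0 * math.log10(a_fe_silicate)
            - 2.0 * math.log10(a_fe_metal)
            - log_a_silica)


# Invented inputs, for illustration only.
t_kelvin = 1773.0
x_fe_ol = 0.10
a_fe_ol = x_fe_ol * gamma_symmetric(x_fe_ol, w=4000.0, t_kelvin=t_kelvin)
log_silica = log_a_sio2(a_en=0.80, a_fo=0.85, log_k=0.30)
dlog_qfi = delta_log_fo2_qfi(a_fe_ol, a_fe_metal=0.60, log_a_silica=log_silica)
print(round(dlog_qfi, 2))
```

To express the result relative to IW, the difference between the QFI and IW equilibria at run P and T would be added to the returned value, as described in the text.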
Activity-composition relation in the system Fe-Ir at high P and T. Iridium
metal was used as a redox sensor to monitor fO2 in runs that contain both elemental
carbon and magnesite between 10 and 23 GPa. Margules interaction parameters
(W) dependent on P, T and X are calculated according to the expression

W_G = W_{H,1bar} - T W_S + (P - 1) W_V   (14)

where T is in K and P in bar.

W_{H,1bar} was obtained by least-squares fitting of a binary asymmetric regular
solution model to the non-temperature-dependent part of the excess Gibbs
energy (G^{xs}) for face-centred cubic (f.c.c.) Ir-Fe alloy. W_S is -5 J mol^-1 K^-1
according to the temperature dependence of the excess Gibbs energy expression
for f.c.c. Ir-Fe alloy. W_V is estimated by least-squares fitting of a binary asymmetric
regular solution model to excess volumes of the Fe-Ir alloy derived from
X-ray data. Activity coefficients γ dependent on P and T, and X values for Fe in
f.c.c. Ir-Fe alloy result from a binary asymmetric regular solution model:

RT \ln \gamma_{Fe}^{metal} = 2 X_{Fe} X_{Ir} W_{G,Ir-Fe} + (X_{Ir})^{2} W_{G,Fe-Ir} - 2 G^{xs}   (15)

where

G^{xs} = X_{Fe} X_{Ir} (X_{Ir} W_{G,Fe-Ir} + X_{Fe} W_{G,Ir-Fe})   (16)
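
Equations (14) to (16) can be sketched in a few lines. The code below is illustrative: it assumes equation (14) reads W_G = W_H,1bar - T W_S + (P - 1) W_V, and that the subscript order in equations (15) and (16) is the one written in the function bodies (the order that makes the two equations thermodynamically consistent). The numerical Margules parameters are placeholders, not the fitted values.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def margules_wg(w_h_1bar, w_s, w_v, p_bar, t_kelvin):
    """P- and T-dependent interaction parameter, after equation (14)."""
    return w_h_1bar - t_kelvin * w_s + (p_bar - 1.0) * w_v


def g_excess(x_fe, w_fe_ir, w_ir_fe):
    """Excess Gibbs energy of the binary alloy, after equation (16)."""
    x_ir = 1.0 - x_fe
    return x_fe * x_ir * (x_ir * w_fe_ir + x_fe * w_ir_fe)


def gamma_fe(x_fe, w_fe_ir, w_ir_fe, t_kelvin):
    """Activity coefficient of Fe in Fe-Ir alloy, after equation (15)."""
    x_ir = 1.0 - x_fe
    rt_ln_gamma = (2.0 * x_fe * x_ir * w_ir_fe
                   + x_ir ** 2 * w_fe_ir
                   - 2.0 * g_excess(x_fe, w_fe_ir, w_ir_fe))
    return math.exp(rt_ln_gamma / (R * t_kelvin))


# Placeholder parameters (J mol^-1, J mol^-1 K^-1, J mol^-1 bar^-1).
w_fe_ir = margules_wg(-50000.0, -5.0, 0.05, p_bar=100000.0, t_kelvin=1773.0)
w_ir_fe = margules_wg(-40000.0, -5.0, 0.05, p_bar=100000.0, t_kelvin=1773.0)
print(gamma_fe(0.05, w_fe_ir, w_ir_fe, 1773.0))
```

When the two parameters are equal, the expression collapses to the symmetric form W (1 - X_Fe)^2 used in equation (11), which is a convenient sanity check on the subscripts.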


The effect of correcting the activity model for P and T is shown in Supplementary
Fig. 1.

Mapping and analysis of chromatin state
dynamics in nine human cell types

Chromatin profiling has emerged as a powerful means of genome annotation and detection of regulatory activity. The
approach is especially well suited to the characterization of non-coding portions of the genome, which critically 
contribute to cellular phenotypes yet remain largely uncharted. Here we map nine chromatin marks across nine cell 
types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions. 
Focusing on cell-type-specific patterns of promoters and enhancers, we define multicell activity profiles for chromatin 
state, gene expression, regulatory motif enrichment and regulator expression. We use correlations between these 
profiles to link enhancers to putative target genes, and predict the cell-type-specific activators and repressors that 
modulate them. The resulting annotations and regulatory predictions have implications for the interpretation of 
genome-wide association studies. Top-scoring disease single nucleotide polymorphisms are frequently positioned 
within enhancer elements specifically active in relevant cell types, and in some cases affect a motif instance for a 
predicted regulator, thus suggesting a mechanism for the association. Our study presents a general framework for
deciphering cis-regulatory connections and their roles in disease.


A major challenge in biology is understanding how a single genome
can give rise to an organism comprising hundreds of distinct cell types.
Much emphasis has been placed on the application of high-throughput
tools to study interacting cellular components. The field of systems
biology has exploited dynamic gene expression patterns to reveal functional
modules, pathways and networks. Yet cis-regulatory elements,
which may be equally dynamic, remain largely uncharted across cellular
conditions.

Chromatin profiling provides a systematic means of detecting cis-regulatory
elements, given the central role of chromatin in mediating
regulatory signals and controlling DNA access, and the paucity of
recognizable sequence signals. Specific histone modifications correlate
with regulator binding, transcriptional initiation and elongation,
enhancer activity and repression. Combinations of modifications
can provide even more precise insight into chromatin state.

Here we apply a high-throughput pipeline to map nine chromatin 
marks and input controls across nine cell types. We use recurrent 
combinations of marks to define 15 chromatin states corresponding 
to repressed, poised and active promoters, strong and weak enhancers, 
putative insulators, transcribed regions, and large-scale repressed and 
inactive domains. We use directed experiments to validate biochemical 
and functional distinctions between states. 

The resulting chromatin state maps portray a highly dynamic land- 
scape, with the specific patterns of change across cell types revealing 
strong correlations between interacting functional elements. We use 
correlated patterns of activity between chromatin state, gene expres- 
sion and regulator activity to connect enhancers to likely target genes, 
to predict cell-type-specific activators and repressors, and to identify 
individual binding motifs responsible for these interactions. 

Our results have implications for the interpretation of genome-wide
association studies (GWASs). We find that disease variants frequently
coincide with enhancer elements specific to a relevant cell
type. In several cases, we can predict upstream regulators whose regulatory
motif instances are affected or target genes whose expression
may be altered, thereby suggesting specific mechanistic hypotheses
for how disease-associated genotypes lead to the observed disease
phenotypes.


Results 

Systematic mapping of chromatin marks in multiple cell types 
To explore chromatin state in a uniform way across multiple cell
types, we applied a production pipeline for chromatin immunoprecipitation
followed by high-throughput sequencing (ChIP-seq) to
generate genome-wide chromatin data sets (Methods and Fig. 1a).
We profiled nine human cell types, including common lines designated
by the ENCODE consortium and primary cell types. These
consist of embryonic stem cells (H1 ES), erythrocytic leukaemia cells 
(K562), B-lymphoblastoid cells (GM12878), hepatocellular carcin- 
oma cells (HepG2), umbilical vein endothelial cells (HUVEC), skel- 
etal muscle myoblasts (HSMM), normal lung fibroblasts (NHLF), 
normal epidermal keratinocytes (NHEK) and mammary epithelial 
cells (HMEC). 

We used antibodies for histone H3 lysine 4 trimethylation
(H3K4me3), a modification associated with promoters; H3K4me2
(dimethylation), associated with promoters and enhancers;
H3K4me1 (methylation), preferentially associated with enhancers;
lysine 9 acetylation (H3K9ac) and H3K27ac, associated with active
regulatory regions; H3K36me3 and H4K20me1, associated with
transcribed regions; H3K27me3, associated with Polycomb-repressed
regions; and CTCF, a sequence-specific insulator protein
with diverse functions. We validated each antibody by western blots
and peptide competitions, and sequenced input controls for each cell
type. We also collected data for H3K9me3, RNA polymerase II
(RNAPII) and H2A.Z (also known as H2AFZ) in a subset of cells.

Figure 1 | Chromatin state discovery and characterization. a, Top: profiles
for nine chromatin marks (greyscale) are shown across the WLS gene in four 
cell types, and summarized in a single chromatin state annotation track for each 
(coloured according to b). WLS is poised in ESCs, repressed in GM12878 and 
transcribed in HUVEC and NHLF. Its TSS switches accordingly between 
poised (purple), repressed (grey) and active (red) promoter states; enhancer 
regions within the gene body become activated (orange, yellow); and its gene 
body changes from low signal (white) to transcribed (green). These chromatin 
state changes summarize coordinated changes in many chromatin marks; for 
example, H3K27me3, H3K4me3 and H3K4me2 jointly mark a poised 
promoter, whereas loss of H3K27me3 and gain of H3K27ac and H3K9ac mark 
promoter activation. WCE, whole-cell extract. Bottom: nine chromatin state 
tracks, one per cell type, in a 900-kb region centred at WLS, summarizing 90 
chromatin tracks in directly interpretable dynamic annotations and showing 
activation and repression patterns for six genes and hundreds of regulatory 
regions, including enhancer states. b, Chromatin states learned jointly across 


This resulted in 90 chromatin maps corresponding to 
~2,400,000,000 reads covering ~100,000,000,000 bases across nine 
cell types, which we set out to interpret computationally. 


Learning a common set of chromatin states across cell types 
To summarize these data sets into nine readily interpretable annotations,
one per cell type, we applied a multivariate hidden Markov
model that uses combinatorial patterns of chromatin marks to distinguish
chromatin states. The approach explicitly models mark combinations
in a set of ‘emission’ parameters and spatial relationships
between neighbouring genomic segments in a set of ‘transition’ parameters
(Methods). It has the advantage of capturing regulatory elements
with greater reliability, robustness and precision than is
possible by studying individual marks.
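
To make the roles of the ‘emission’ and ‘transition’ parameters concrete, the toy sketch below decodes a short sequence of genomic bins. It is not the authors' implementation: it assumes each mark is reduced to present/absent per bin and emitted independently given the state, it uses Viterbi decoding as an illustrative way to assign states, and the two states, two marks and all probabilities are invented.

```python
import math


def log_emission(obs, probs):
    """Log-probability of a binary mark vector under independent emissions.

    probs[i] is the frequency with which mark i is present in the state;
    all probabilities must lie strictly between 0 and 1.
    """
    return sum(math.log(p if o else 1.0 - p) for o, p in zip(obs, probs))


def viterbi(observations, emissions, transitions, initial):
    """Most likely state path for a sequence of binary mark vectors."""
    n_states = len(emissions)
    score = [math.log(initial[s]) + log_emission(observations[0], emissions[s])
             for s in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        previous = score
        score = []
        pointers = []
        for s in range(n_states):
            # Best predecessor state for ending in state s at this bin.
            best = max(range(n_states),
                       key=lambda r: previous[r] + math.log(transitions[r][s]))
            pointers.append(best)
            score.append(previous[best] + math.log(transitions[best][s])
                         + log_emission(obs, emissions[s]))
        backpointers.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Invented toy model: state 0 is promoter-like, state 1 is transcribed-like;
# the two marks stand in for H3K4me3 and H3K36me3.
emissions = [[0.9, 0.1], [0.1, 0.8]]
transitions = [[0.8, 0.2], [0.2, 0.8]]
initial = [0.5, 0.5]
bins = [(1, 0), (1, 0), (0, 1), (0, 1)]
path = viterbi(bins, emissions, transitions, initial)
print(path)
```

The emission table plays the part of the mark-frequency table in Fig. 1b, while the transition matrix encodes how likely neighbouring bins are to share or switch state.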

We learned chromatin states jointly by creating a virtual conca- 
tenation of all chromosomes from all cell types. We selected 15 states 
that showed distinct biological enrichments and were consistently 
recovered (Fig. 1a, b and Supplementary Fig. 1). Even though states

cell types by a multivariate hidden Markov model. The table shows emission
parameters learned de novo on the basis of genome-wide recurrent 
combinations of chromatin marks. Each entry denotes the frequency with 
which a given mark is found at genomic positions corresponding to the 
chromatin state. c, Genome coverage, functional enrichments and candidate 
annotations for each chromatin state. Blue shading indicates intensity, scaled 
by column. CNV, copy number variation; GM, GM12878. d, Box plots 
depicting enhancer activity for predicted regulatory elements. Sequences
250 bp long corresponding either to strong or weak/poised HepG2 enhancer
elements or to GM12878-specific strong enhancer elements were inserted
upstream of a luciferase gene and transfected into HepG2. Reporter activity was
measured in relative light units. Robust activity is seen for strong enhancers in 
the matched cell type, but not for weak/poised enhancers or for strong 
enhancers specific to a different cell type. Boxes indicate 25th, 50th and 75th 
percentiles, and whiskers indicate 5th and 95th percentiles. 


were learned de novo solely on the basis of the patterns of chromatin
marks and their spatial relationships, they showed distinct associations
with transcriptional start sites (TSSs), transcripts, evolutionarily
conserved non-coding regions, DNase hypersensitive sites, binding
sites for the regulators c-Myc (MYC) and NF-κB, and inactive
genomic regions associated with the nuclear lamina (Fig. 1c).

We distinguished six broad classes of chromatin states, which we 
refer to as promoter, enhancer, insulator, transcribed, repressed and 
inactive states (Fig. 1c). Within them, active, weak and poised promoters
(states 1-3) differ in expression level, strong and weak candidate
enhancers (states 4-7) differ in expression of proximal genes, and 
strongly and weakly transcribed regions (states 9-11) also differ in 
their positional enrichments along transcripts. Similarly, Polycomb- 
repressed regions (state 12) differ from heterochromatic and repetitive 
states (states 13-15), which are also enriched for H3K9me3 (Sup- 
plementary Figs 2-4). 

The states vary widely in their average segment length (~500 base
pairs (bp) for promoter and enhancer states versus 10 kb for inactive
regions) and in the portion of the genome covered (<1% for promoter
and enhancer states versus >70% for inactive state 13). For each state,
coverage was relatively stable across cell types (Supplementary Fig. 5),
with the exception of embryonic stem cells (ESCs) in which the poised
promoter state is more abundant but strong enhancer and Polycomb-repressed
states are depleted, consistent with the unique biology of
pluripotent cells.

We confirmed that promoter and enhancer states showed distinct
biochemical properties (Supplementary Fig. 6). RNAPII was highly
enriched at strong promoters, weakly enriched at strong enhancers
and nearly undetectable at weak or poised enhancers, consistent with
strong transcription at promoters and reports of weak transcription at
active enhancers. H2A.Z, a histone variant associated with nucleosome-free
regions, was enriched in active promoters and strong
enhancers, consistent with nucleosome displacement at TSSs and sites
of abundant transcription factor binding in active enhancers.

We also used luciferase reporter assays to validate the functionality 
of predicted enhancers, the distinction between strong and weak 
enhancer states, and their predicted cell type specificity. We tested 
strong enhancers, weak enhancers and strong enhancers specific to an 
unmatched cell type by transfection in HepG2. We observed strong 
luciferase activity only for strong enhancer elements from the 
matched cell type (Fig. 1d). 

These results and additional properties of the model (Supplemen- 
tary Figs 7-10) suggest that chromatin states are an inherent, bio- 
logically informative feature of the genome. The framework enables 
us to reason about coordinated differences in marks by directly study- 
ing chromatin state changes between cell types (which we refer to as 
‘changes’ or ‘dynamics’ without implying any temporal relationship). 


Extent and significance of chromatin state changes across cell 
types 

We next explored the extent to which chromatin states vary between 
pairs of cell types. The overall patterns of variability (Supplementary 
Figs 11 and 12) suggest that regulatory regions vary drastically in activity 
level across cell types. Enhancer states show frequent interchange 
between strong and weak, and promoter states vary between active, 
weak and poised. Promoter states seem more stable than enhancers; 
they are eight times more likely to remain promoter states, controlling 
for coverage. Switching was also observed among promoter, enhancer 
and transcriptional transition states, but no preferential changes to other 
groups were found. These general patterns suggest that despite varying 
activity levels, enhancer and promoter regions tend to preserve their 
chromatin identity as regions of regulatory potential. 
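
The phrase ‘controlling for coverage’ suggests comparing observed state pairs against what genome coverage alone would predict. The text does not spell out the normalization, so the sketch below is only one plausible reading: it computes, for each pair of states seen at the same genomic bin in two cell types, the ratio of the observed pair frequency to the frequency expected if the two annotations were independent. The state labels and the toy annotations are invented.

```python
from collections import Counter


def pair_enrichment(states_a, states_b):
    """Observed/expected frequency for each (state in A, state in B) pair.

    The expectation assumes independence between the two cell types, that is
    the product of each state's genome coverage in its own cell type.
    """
    n = len(states_a)
    pair_counts = Counter(zip(states_a, states_b))
    cover_a = Counter(states_a)
    cover_b = Counter(states_b)
    return {
        (s, t): (count / n) / ((cover_a[s] / n) * (cover_b[t] / n))
        for (s, t), count in pair_counts.items()
    }


# Toy annotations over eight bins: P = promoter, E = enhancer, I = inactive.
cell_a = ["P", "P", "E", "I", "I", "I", "I", "I"]
cell_b = ["P", "P", "E", "I", "I", "I", "I", "I"]
enrichment = pair_enrichment(cell_a, cell_b)
print(enrichment[("P", "P")], enrichment[("I", "I")])
```

Rare states that persist between cell types score far above 1 under this measure, whereas an abundant state such as the inactive one scores close to 1 even when it is fully preserved, which is why a coverage correction matters when comparing state stability.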

Chromatin state differences between cell types relate to cell-type- 
specific gene functions. An unbiased clustering of chromatin state 
profiles across annotated TSSs in lymphoblastoid and skeletal muscle 
cells distinguished informative patterns predictive of downstream gene 
expression and functional gene classes (Supplementary Figs 13 and 
14). Cell-type-specific patterns were also evident when TSSs were sim- 
ply assigned to the most prevalent chromatin state. Promoters active in 
skeletal muscle were associated with extracellular structure genes (8.5- 
fold enrichment), those active in lymphoblastoid cells were associated 
with immune response genes (7.2-fold enrichment) and those active in 
both were associated with metabolic housekeeping genes. 


Clustering of promoter and enhancer states on the basis of their 
activity patterns 

Extending our pairwise promoter analysis, we clustered active promoter 
and strong enhancer regions across all cell types (Methods). This 
revealed clusters showing common activity and associated with highly 
coherent functions (Fig. 2). For promoter clusters, these include 
immune response (GM12878-specific clusters, P< 10 '%), cholesterol 
transport (HepG2 specific, P< 10 *) and metabolic processes (all cells, 
P<10 ‘*), Remarkably, genes assigned to enhancer clusters by proxi- 
mity also showed strong functional enrichments, including immune

Figure 2 | Cell-type-specific promoter and enhancer states and associated
functional enrichments. a, Clustering of genomic locations (rows) assigned to 
active promoter state 1 (red) across cell types (columns) reveals 20 common 
patterns of activity (A-T; Methods). For each cluster, enriched gene ontology 
terms are shown with hypergeometric P value and fold enrichment, based on 
the nearest TSS. For most clusters, several cell types show strong (dark red) or 
moderate (light red) activity. b, Analogous clustering and functional 
enrichments for strong enhancer state 4 (yellow). Enhancer states show greater 
cell type specificity, with most clusters active in only one cell type. 


response (GM12878 specific, P< 10°), lipid metabolism (HepG2 spe- 
cific, P< 10 °) and angiogenesis (HUVEC specific, P< 10°). 
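
The functional enrichments quoted for each cluster rest on a hypergeometric P value and a fold-enrichment ratio (see the Figure 2 legend). A minimal sketch of both quantities follows; the exact gene universe and annotation sets used in the study are not given here, so the counts in the usage lines are invented, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def fold_enrichment(k, n, K, N):
    """Observed fraction of annotated genes divided by the background fraction."""
    return (k / n) / (K / N)


# Invented example: 20 of 150 cluster genes carry a term that annotates
# 400 of 20,000 genes in the background.
p_value = hypergeom_pvalue(20, 150, 400, 20000)
fold = fold_enrichment(20, 150, 400, 20000)
print(p_value, round(fold, 2))
```

Exact integer arithmetic keeps the tail sum stable even when the binomial coefficients are far larger than a floating-point number can hold.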

Promoters and enhancers differed in their overall specificities. The 
majority of promoter clusters showed activity in multiple cell types, consistent
with previous work (Fig. 2a). Enhancer clusters are significantly
more cell type specific, with few regions showing activity in more than two 
cell types and a majority being specific to a single cell type (Fig. 2b). 

We also found differences in the relative contributions of enhancer-based
and promoter-based regulation among gene classes. Developmental
genes seem to be strongly regulated by both, showing the
highest number of proximal enhancers and diverse promoter states, 
including poised and Polycomb repressed (Supplementary Fig. 15). 
Tissue-specific genes (for example immune genes and steroid meta- 
bolism genes) seem to be more dependent on enhancer regulation, 
showing multiple tissue-specific enhancers but less diverse promoter 
states. Lastly, housekeeping genes are primarily promoter regulated, 
with few enhancers in their vicinities. 

Overall, this dynamic view of the chromatin landscape suggests that 
multicell chromatin profiles can be as productive for systems biology as 
expression analysis has traditionally been, and may hold additional 
information on genome regulatory programs, which we explore next. 


Correlations in activity profiles link enhancers to target genes 
We next investigated functional interconnections among enhancers, 
the factors that activate or repress them, and the genes whose expres- 
sion they regulate, by defining ‘activity profiles’ for each across the cell 
types (Fig. 3). We complemented these enhancer activity profiles 
(Fig. 3a) with profiles for gene expression (Fig. 3b), sequence motif 
enrichment (Fig. 3d) and the expression of transcription factors 
recognizing each motif (Fig. 3e). We used correlations between these 
profiles to probabilistically link enhancers to their downstream targets 
and upstream regulators (Methods). 

We found that patterns of enhancer activity (Figs 2b and 3a) cor- 
related strongly with patterns of nearest-gene expression (Fig. 3b; 
correlation, >0.9 in 16 of 20 clusters). Because this correlation 
remained high even for large distances (>50 kb), we used activity 
correlation as a complement to genomic distance for linking enhan- 
cers to target genes (Methods). Activity-based linking yielded an 
increase in functional gene class enrichment for several clusters 
(Supplementary Fig. 16). 
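
The idea of activity-based linking can be illustrated with a plain Pearson correlation between an enhancer's activity profile and each candidate gene's expression profile across the nine cell types. The sketch below is a simplification: the study links enhancers to genes probabilistically and combines correlation with genomic distance (Methods), whereas this example uses a hard correlation cut-off. The gene names, profiles and the `min_corr` threshold are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def link_enhancer(enhancer_activity, gene_expression, min_corr):
    """Candidate target genes, sorted by decreasing correlation."""
    links = [(gene, pearson(enhancer_activity, profile))
             for gene, profile in gene_expression.items()]
    kept = [link for link in links if link[1] >= min_corr]
    return sorted(kept, key=lambda link: link[1], reverse=True)


# Hypothetical profiles across nine cell types.
enhancer = [0.9, 0.1, 0.0, 0.8, 0.1, 0.0, 0.0, 0.1, 0.0]
genes = {
    "GENE_A": [8.0, 1.0, 0.5, 7.0, 1.2, 0.4, 0.6, 1.1, 0.5],
    "GENE_B": [1.0, 6.0, 5.5, 0.8, 1.0, 6.2, 5.9, 1.3, 6.1],
}
links = link_enhancer(enhancer, genes, min_corr=0.8)
print(links)
```

In practice the correlation would be one input among several; a distant gene with a matching profile may still be a weaker candidate than a nearby one.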

We validated our approach using quantitative trait locus mapping 
studies that use covariation between single nucleotide polymorphism 
(SNP) alleles and gene expression levels to link cis-regulatory regions

Figure 3 | Correlations in activity patterns link enhancers to gene targets
and upstream regulators. a, Average enhancer activity across the cell types 
(columns) for each enhancer cluster (rows) defined in Fig. 2b (labelled A-T) and 
number of 200-bp windows in each cluster. b, Average messenger RNA 
expression of nearest gene across the cell types and correlation with enhancer 
activity profile from a. High correlations between enhancer activity and gene 
expression provide a means of linking enhancers to target genes. c, Enrichment 
for Oct4 binding in ESCs and NF-κB binding in lymphoblastoid cells for each
cluster. TF, transcription factor. d, Strongly enriched (red) or depleted (blue) 
motifs for each cluster, from a catalogue of 323 consensus motifs. Rfx: Rfx family; 
Nrf-2: NFE2L2; STAT: STAT family; Ets: Ets family; Mef2: MEF2A and MYEF2; 


to target genes. Investigation of four recent quantitative trait locus 
studies in liver and lymphoblastoid cells revealed remarkable
agreement with our enhancer predictions. Enhancers linked to a given 
target gene by our method were significantly enriched for SNPs cor- 
related with the gene’s expression level (Supplementary Fig. 17), thus 
confirming our enhancer-gene linkages with orthogonal data. 


Correlations with transcription factor expression and motif 
enrichment predict upstream regulators 

We next predicted, on the basis of regulatory motif enrichments, 
sequence-specific transcription factors likely to target enhancers in 
a given cluster. This implicated a number of transcription factors 
whose known biological roles matched the respective cell types 
(Fig. 3d and Supplementary Fig. 18). When ChIP-seq data on the 
relevant cell type was available, we confirmed that enriched motifs 
were preferentially bound by the cognate factor (Fig. 3c). Oct4
(POU5F1) motif instances in cluster A (ESC-specific enhancers) were
preferentially bound by Oct4 in ESCs, and NF-κB motif instances in
cluster F (lymphoblastoid-specific enhancers) were preferentially
bound by NF-κB in lymphoblastoid cells. In both cases, motif
instances in cell-type-specific enhancers showed a ~5-fold increase 
in binding in comparison with other enhancers. 

However, sequence-based motif enrichments do not distinguish 
causality. Enrichment could reflect a parallel binding event that does 
not affect the chromatin state, or the motif could actually be antagonistic 
to the enhancer state through specific repression in orthogonal cell 
types. To distinguish between these possibilities, we complemented 
the observed motif enrichments with cell-type-specific expression for 
the corresponding transcription factors (Fig. 3e). We then correlated 
a ‘motif score’ based on motif enrichment in a given cluster, and a 
‘transcription factor expression score’ based on the agreement between

Myf: Myf family; NF-Y: NFYA, NFYB and NFYC. e, Predicted causal regulators
for each cluster based on positive (activators) or negative (repressors) 
correlations between motif enrichment (top left triangles) and transcription 
factor expression (bottom right triangles). For example, the red-yellow 
combination indicates that Oct4 is a positive regulator of ESC-specific 
enhancers, as its motif-based predicted targets are enriched (red upper triangle) 
for enhancers active in ESCs (cluster A), and the Oct4 gene is expressed 
specifically in ESCs, resulting in a positive transcription factor expression 
correlation (yellow triangle). Overall correlations between motif enrichment and 
transcription factor expression across all clusters denote predicted activators 
(positive correlation, orange) and repressors (negative correlation, purple). 


the transcription factor expression pattern and the cluster activity pro- 
file (Methods). A positive correlation between the two scores implies 
that the transcription factor may be establishing or reinforcing the 
chromatin state. A negative correlation would instead imply that the 
transcription factor may act as a repressor. For example, in addition to 
the enrichment of the Oct4 motif in the ESC-specific cluster A, Oct4 is 
specifically expressed in ESCs, leading to the prediction that it is a causal 
regulator of ESCs (Fig. 3e), consistent with known biology’®. 

For 18 of the 20 clusters, this analysis revealed one or more 
candidate regulators. Recovery of known roles for well-studied regulators 
validated our approach. For example, HNF1 (HNF1A), HNF4 
(HNF4A) and PPARγ (PPARG) are predicted as activators of 
HepG2-specific enhancers (clusters H and I), PU.1 (SPI1) and NF-κB 
as activators of lymphoblastoid (GM12878) enhancers (clusters C, 
F and G), GATA1 as an activator of K562-specific enhancers (cluster 
B) and Myf family members as activators of HSMM enhancers (cluster O). 

The analysis also revealed potentially novel regulatory interactions. 
ETS-related factors (ELK1, TEL2 (ETV7) and Ets family members) 
are predicted activators of enhancers active in both GM12878 and 
HUVEC (cluster G) but not of GM12878-specific or HUVEC-specific 
clusters, emphasizing the value of unbiased clustering. These 
connections are consistent with reported roles for ETS factors in 
lymphopoiesis and endothelium. The prediction of p53 (TP53) as an 
activator in HSMM, NHLF, NHEK and HMEC (clusters N, Q and 
R) probably reflects its maintained activity in these primary cells, as 
opposed to cell models in which it may be suppressed by mutation 
(K562), viral inactivation (GM12878) or cytoplasmic localization 
(ESCs). A widespread role for p53 in regulating distal elements is 
consistent with its known binding to distal regions. 

Our analysis also revealed several repressor signatures, including 
GFI1 in K562 and GM12878 (clusters B and C) and BACH2 in ESCs 
(cluster A). Both regulators are known to repress transcription by 
recruiting histone deacetylases and methyltransferases to proximal 
promoters, and GFI1 has also been implicated in the silencing of 
satellite repeats. Our regulatory inferences suggest that these 
regulators also modulate chromatin to inhibit enhancer activity, pointing 
to a new mechanism for distal gene regulation. 


Validation of predicted binding events and regulatory outcomes 

The regulatory inferences above imply transcription-factor-binding 
events at motif instances within enhancer regions in specific cellular 
contexts, and we sought to validate these inferences using a general 
molecular signature. Binding events are associated with nucleosome 
displacement, a structural change evident in ChIP-seq data for 
histones. We thus studied local depletions in the chromatin intensity 
profiles (‘dips’) as these are indicative of transcription factor binding. 
We confirmed that dips were present in individual signal tracks for 
active enhancers and were associated with preferential sequence con- 
servation and regulatory motif instances (Fig. 4a). 

To test our specific predictions, we superimposed chromatin pro- 
files of coordinately regulated enhancer regions, anchoring them on 
the implied motif instances. Striking dips precisely coincide with 
regulatory motifs, and are both cell type specific and region specific, 
exactly as predicted (Fig. 4b, c). Because dips only appear when the 
factor is expressed, they also support the identity of the trans-acting 
transcription factor. 
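
The superposition step described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`aggregate_profile`, `dip_depth`), the binned toy signal and the dip summary statistic are all assumptions introduced here for illustration. The sketch averages a signal track over windows centred on motif positions and reports how low the centre sits relative to the window edges.

```python
def aggregate_profile(signal, anchors, flank):
    """Average a binned signal over windows of +/- `flank` bins around each anchor.

    `signal` is a list of per-bin values, `anchors` are bin indices of motif
    instances. Anchors whose window runs off the track are skipped.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for a in anchors:
        lo, hi = a - flank, a + flank
        if lo < 0 or hi >= len(signal):
            continue
        for i in range(width):
            totals[i] += signal[lo + i]
        used += 1
    if used == 0:
        raise ValueError("no anchor has a complete window")
    return [t / used for t in totals]


def dip_depth(profile):
    """Hypothetical summary: centre value divided by the mean of the two edges.

    Values well below 1 suggest a local depletion ('dip') at the anchor.
    Assumes the edge values are not both zero.
    """
    centre = profile[len(profile) // 2]
    edges = (profile[0] + profile[-1]) / 2
    return centre / edges


# Toy track with two motif instances (bins 2 and 7), each sitting in a dip.
signal = [5, 6, 2, 6, 5, 5, 6, 2, 6, 5]
profile = aggregate_profile(signal, anchors=[2, 7], flank=2)
depth = dip_depth(profile)
```

In real data the same aggregation would be repeated per cell type, so that a dip appearing only where the factor is expressed can be compared against the other cell types.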

To confirm that predicted causal motifs contribute to enhancer 
activity, we used luciferase reporters. Our model implicated HNF 
regulators as activators of HepG2-specific enhancers (Fig. 3), and 
context-specific dips supported binding interactions (Fig. 4c). We 
thus selected for functional analysis ten sites with HNF motifs show- 
ing dips in strong HepG2-specific enhancers, and evaluated them 
with and without the HNF motif. We found that permutation of 
the motif consistently led to a reduction in enhancer activity 
(Fig. 4d), supporting its predicted causal role. 

Figure 4 | Validation of regulatory predictions by nucleosome depletions 
and enhancer activity. a, Dips in chromatin intensity profiles in a K562- 
specific strong enhancer (orange) coincide with a predicted causal GATA motif 
instance (logo). The dips probably reflect nucleosome displacement associated 
with transcription factor binding, supported by DNase hypersensitivity and 
GATA1 binding. b, Superposition of H3K27ac signal across loci containing 
GATA motifs, centred on motif instances, shows dips in K562, as predicted. 
c, Superposition of H3K4me2 signal for HepG2 shows dips over HNF4 motifs 
in strong enhancer states, as predicted. d, HepG2-specific strong enhancers 
with predicted causal HNF motifs were tested in reporter assays. Constructs 
with permuted HNF motifs (red) led to significantly reduced luciferase activity 
in comparison with wild type (blue), with an average twofold reduction. Data 
shown are mean luciferase relative light units over three replicates and 95% 
confidence intervals. 


Assigning candidate regulatory functions to disease-associated 
variants 

Finally, we explored whether our chromatin annotations and regula- 
tory predictions can provide insight into sequence variants associated 
with disease phenotypes. To that end, we gathered a large set of 
non-coding SNPs from GWAS catalogues, an exceedingly small 
proportion of which are understood at present. 

We found that disease-associated SNPs are significantly more likely 
to coincide with strong enhancers (states 4 and 5; twofold enrichment, 
P < 10⁻¹⁰), despite the fact that no notable association with these 
states is seen for SNPs in general or for those SNPs tested in the 
studies. To test whether SNPs associated with a particular disease 
might have even more specific correspondences, we examined 426 
GWAS data sets. We identified ten studies whose variants showed 
significant correspondences to cell-type-specific strong enhancer 
states (Methods and Fig. 5a). 
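
As a rough illustration of how such an enrichment and its P value might be estimated, the sketch below shifts all SNPs by a common random offset on a circular toy genome, which preserves the distances between SNPs, and compares the observed overlap with the shifted overlaps. The binned coordinates, the circular genome and the names `count_overlaps` and `shift_test` are assumptions made for this example; the Methods state only that SNP locations were randomized while preserving relative distance.

```python
import random


def count_overlaps(snps, enhancer_bins):
    """Number of SNP positions that fall in an enhancer bin."""
    return sum(1 for s in snps if s in enhancer_bins)


def shift_test(snps, enhancer_bins, genome_bins, n_perm=1000, seed=0):
    """Empirical fold enrichment and P value from common-offset shifts.

    All SNPs are moved by the same random offset (modulo the genome length),
    so clustering among SNPs is retained in every randomization.
    """
    rng = random.Random(seed)
    observed = count_overlaps(snps, enhancer_bins)
    null = []
    for _ in range(n_perm):
        offset = rng.randrange(1, genome_bins)
        shifted = [(s + offset) % genome_bins for s in snps]
        null.append(count_overlaps(shifted, enhancer_bins))
    mean_null = sum(null) / n_perm
    fold = observed / mean_null if mean_null > 0 else float("inf")
    # Add-one correction keeps the empirical P value strictly positive.
    p_value = (1 + sum(1 for c in null if c >= observed)) / (n_perm + 1)
    return observed, fold, p_value


# Toy genome of 1,000 bins with two enhancer blocks covering 4% of it.
enhancers = set(range(100, 120)) | set(range(500, 520))
snps = [101, 105, 110, 502, 515, 700, 850, 900]
observed, fold, p_value = shift_test(snps, enhancers, genome_bins=1000)
```

A per-study false-discovery rate would require a second layer of permutation across studies, which this sketch omits.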

Individual variants from these studies were strongly enriched in 
enhancer states specifically active in relevant cell types (Fig. 5a, b). 
For example, SNPs associated with erythrocyte phenotypes were 
found in erythrocytic leukaemia cell (K562) enhancers, SNPs 
associated with systemic lupus erythematosus were found in 
lymphoblastoid cell (GM12878) enhancers and SNPs associated with triglyceride 
phenotypes or blood lipid phenotypes were found in hepatocellular 
carcinoma cell (HepG2) enhancers. We also applied our model to 
chromatin data for T cells (Supplementary Fig. 19), for which strong 
enhancer states correlated with variants associated with risk of childhood 
acute lymphoblastic leukaemia, further validating our approach. 

We also used our predicted enhancer/target gene associations to 
find candidate downstream genes whose expression might be affected 
by cis changes occurring in the enhancer region. Although most of the 
predicted target genes are proximal to the enhancer, a subset of more 
distal predicted targets could reflect novel candidates for the disease 
phenotypes (Fig. 5b). 

In addition, we identified several instances in which a lead GWAS 
variant does not correspond to a particular chromatin element but a 
linked variant coincides with an enhancer with the predicted cell type 
specificity (Fig. 5c). Thus, chromatin profiles may provide a general 
means of triaging variants within a haplotype block, a common problem 
faced in GWASs. 

Lastly, we identified several cases in which a disease-associated SNP 
created or disrupted a regulatory motif instance for a predicted causal 
transcription factor in the relevant cell type (Fig. 5d), suggesting a 
specific molecular mechanism by which the disease-associated geno- 
type could lead to the observed disease phenotype consistent with our 
regulatory predictions. 


Discussion 


Our work demonstrates the power of multicell chromatin profiles as 
an additional and dynamic layer of genome annotation. We presented 
methods to distinguish different classes of functional elements, elu- 
cidate their cell type specificities and predict cis-regulatory interac- 
tions that drive gene expression programs. By intersecting our 
predictions with non-coding SNPs from GWAS data sets, we pro- 
posed potential mechanistic explanations for disease variants, either 
through their presence within cell-type-specific enhancer states or by 
their effect on binding motifs for predicted regulators. 

Chromatin states drastically reduced the large combinatorial space 
of 90 chromatin data sets (2⁹⁰ combinations) to a manageable set of 
biologically interpretable annotations, thus providing an efficient and 
robust way to track coordinated changes across cell types. This 
allowed the systematic identification and comparison of more than 
100,000 promoter and enhancer elements. Both types of element are 
cell type specific, are associated with motif enrichments and assume 
strong, weak and poised states that correlate with neighbouring gene 
expression and function. Enhancers showed very high tissue 
specificity, enrichment in the vicinity of developmental and 
cell-type-specific genes, and predictive power for proximal gene expression, 
reinforcing their roles as sentinels of tissue-specific gene expression. 
By elucidating enhancers systematically, and linking them to 
upstream regulators and downstream genes, our analysis can help 
provide a missing link between regulators and target genes. The power 
of the approach should increase considerably as additional pheno- 
typically distinct cell types are surveyed, and should enable a greater 
proportion of enhancer elements to be incorporated into the connec- 
tivity network. 

The inferred cis-regulatory interactions make specific testable pre- 
dictions, many of which were confirmed through additional experi- 
ments and analyses. Our enhancer/target gene linkages are supported 
by cis-regulatory inferences from quantitative trait locus mapping 
studies. Predicted transcription factor/motif interactions within 
cell-type-specific enhancers were confirmed in specific cases by tran- 
scription factor binding and more generally by depletions in the chro- 
matin profiles at causal motifs in appropriate cellular contexts. Motifs 
predicted as causal regulators of cell-type-specific enhancers were also 
confirmed in enhancer assays. 

The regulatory inferences afforded by multicell chromatin profiles 
are unique and highly complementary to data sets for transcription 
factor binding, expression, chromatin accessibility, nucleosome 
positioning and chromosome conformation. For example, our 
regulatory predictions can help focus the spectrum of transcription- 
factor-binding events to a smaller number of functional interactions. 
The ‘chromatin-centric’ approach also complements the extensive 
body of work on biological network inference from expression data 
with the potential to introduce enhancers and other genomic elements 
into connectivity networks. 

Our study has important implications for the understanding of 
disease. Our detailed and dynamic functional annotations of the 
relatively uncharted non-coding genome can facilitate the interpreta- 
tion of GWAS data sets by predicting specific cell types and regulators 
related to specific diseases and phenotypes. Furthermore, the connec- 
tions derived for enhancer regions, to upstream regulators and down- 
stream genes, suggest cis- and trans-acting interactions that may be 
modulated by the sequence variants. Although the present study 
represents only a first, small step in this direction, we expect that 
future iterations with a greater diversity of cell types and improved 
methodologies will help define the molecular underpinnings of human 
disease. 

METHODS SUMMARY 


We performed ChIP-seq analysis in biological replicate as previously described, 
using antibodies validated by western blots and peptide competitions. ChIP DNA 
and input controls were sequenced using the Illumina Genome Analyser. Expression 
profiles were acquired using Affymetrix GeneChip arrays. Chromatin states were 
learned jointly by applying a hidden Markov model to ten data tracks for each of the 
nine cell types. We focused on a 15-state model that provides sufficient resolution to 
resolve biologically meaningful patterns yet is reproducible across cell types when 
independently processed. We used this model to produce nine genome-wide 
chromatin state annotations, which were validated by additional ChIP experiments and 
reporter assays. Multicell type clustering was conducted on locations assigned to 
strong promoter state 1 (or strong enhancer state 4) in at least one cell type using the 
k-means algorithm. We predicted enhancer/target gene linkages by correlating 
normalized signal intensities of H3K27ac, H3K4me1 and H3K4me2 with gene 
expression across cell types as a function of distance to the TSS. Upstream regulators 
were predicted using a set of known transcription factor motifs assembled from 
multiple sources. Motif instances were identified by sequence match and 
evolutionary conservation. We based P values for GWAS studies on randomizing the location 
of SNPs, and based the false-discovery rate on randomizing the assignment of SNPs 
across studies. Data sets are available from the ENCODE website 
(http://genome.ucsc.edu/ENCODE), the supporting website for this paper 
(http://compbio.mit.edu/ENCODE_chromatin_states) and the Gene Expression Omnibus (GSE26386). 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Cell culture. Human H1 ES cells were cultured in TeSR media on Matrigel by 
Cellular Dynamics International. Cells were split with dispase and collected at a 
passage number between 30 and 40. Before collection, cells were karyotyped and 
stained for Oct4 to confirm pluripotency. K562 erythrocytic leukaemia cells 
(ATCC CCL-243, lot no. 4607240) were grown in suspension in RPMI medium 
(HyClone SH30022.02) with 10% fetal bovine serum (FBS) and 1% Antibiotic- 
Antimycotic (GIBCO 15240-062). Cell density was maintained at between 
3 × 10⁵ and 7 × 10⁵ cells ml⁻¹. GM12878 B-lymphoblastoid cells (Coriell Cell 
Repositories, ‘expansion A’) were grown in suspension in RPMI 1640 medium 
with 15% FBS (not heat inactivated), 2 mM L-glutamine and 1% 
penicillin/streptomycin. Cells were seeded at a concentration of ~2 × 10⁵ viable cells ml⁻¹ with 
minimal disruption, and maintained at between 3 × 10⁵ and 7 × 10⁵ cells ml⁻¹. 
HepG2 hepatocellular carcinoma cells (ATCC HB-8065, lot no. 4968519) were 
cultured in DMEM (HyClone SH30022.02) with 10% FBS and 1% penicillin/ 
streptomycin. Cells were trypsinized, resuspended to single-cell suspension, split 
to a confluence of between 15 and 20% and then collected at ~75% confluence. 
NHEK normal human epidermal keratinocytes isolated from skin (Lonza CC- 
2501, lot no. 4F1155J, passage 1) were grown in keratinocyte basal medium 2 
(KGM-2 BulletKit, Lonza) supplemented with BPE, hEGF, hydrocortisone, GA-1000, 
transferrin, epinephrine and insulin. Cells were seeded at the recommended 
density (3,500 cells cm⁻²), subjected to two or three passages on polystyrene 
tissue culture plates and collected at a confluence of 70 to 80%. HSMM primary 
human skeletal muscle myoblasts (Lonza CC-2580, lot no. 6F4444, passage 2) 
were cultured in Smooth Muscle Growth Medium 2 (SkGM-2 BulletKit, Lonza) 
supplemented with rhEGF, dexamethasone, L-glutamine, FBS and GA-1000. 
Cells were seeded at the recommended density (3,500 cells cm⁻²), subjected to 
two or three passages and collected at a confluence of 50 to 70%. NHLF primary 
normal human lung fibroblasts (Lonza CC-2512, lot no. 4F0758, passage 2) were 
grown in Fibroblast Cell Basal Medium 2 (FGM-2 BulletKit, Lonza) supplemented 
with hFGF-β, insulin, FBS and GA-1000. Cells were seeded at the recommended 
density (2,500 cells cm⁻²), subjected to two or three passages and collected at an 
approximate confluence of 80%. HUVEC primary human umbilical vein endothelial 
cells (Lonza CC-2517, lot no. 7F3239, passage 1) were grown in endothelial 
basal medium 2 (EGM-2 BulletKit, Lonza) supplemented with hFGF-β, 
hydrocortisone, VEGF, R3-IGF-1, ascorbic acid, heparin, FBS, hEGF and GA-1000. Cells 
were seeded at the recommended density (2,500-5,000 cells cm⁻²), subjected to 
two or three passages and collected at a confluence of 70 to 80%. HMEC primary 
human mammary epithelial cells from mammary reduction tissue (Lonza CC-2551, 
passage 7) were grown in mammary epithelial basal medium (MEGM 
BulletKit, Lonza) supplemented with hEGF-β, hydrocortisone, BPE, GA-1000 
and insulin. Cells were seeded at the recommended density (2,500 cells cm⁻²), 
subjected to two or three passages and collected at 60 to 80% confluence. 

Antibodies. ChIP assays were performed using the following antibody reagents: 
H3K4me1 (Abcam ab8895, lot 38311/659352), H3K4me2 (Abcam ab7766, lot 
56293), H3K4me3 (Abcam ab8580, lot 331024; Millipore 04-473, lot 
DAM1623866), H3K9ac (Abcam ab44441, lot 455103/550799), H3K27ac 
(Abcam ab4729, lot 31456), H3K36me3 (Abcam ab9050, lot 136353), 
H4K20me1 (Abcam ab9051, lot 104513/519198), H3K27me3 (Millipore 07-449, 
lot DAM1387952/DAM1514011), CTCF (Millipore 07-729, lot 1350637), 
H3K9me3 (Abcam ab8898, lot 484088), H2A.Z (Millipore 07-594, lot 
DAM1504736) and RNAPII N terminus (Santa Cruz sc-899X, lot H0510). All 
antibody lots were extensively validated for specificity and efficacy in ChIP-seq. 
Western blots were used to confirm specific recognition of histone protein (or 
CTCF). Dot blots performed using arrayed histone tail peptides representing 
various modification states were used to confirm specificity for the appropriate 
modification. ChIP-seq assays performed on a common cell reagent were used to 
confirm consistency between different lots of the same antibody. 

Chromatin immunoprecipitation. Cells were harvested by crosslinking with 1% 
formaldehyde in cell culture medium for 10 min at 37 °C. After quenching with 
the addition of 125 mM glycine for 5 min at 37 °C, the cells were washed twice 
with cold PBS containing protease inhibitor (Roche). After aspiration of all liquid, 
pellets consisting of ~10⁷ cells were flash frozen and stored at −80 °C. Fixed cells 
were thawed and sonicated to obtain chromatin fragments of ~200 to 700 bp 
using a Bioruptor (Diagenode). Immunoprecipitation was performed as 
previously described, retaining a fraction of input ‘whole-cell extract’ as a control. 
Briefly, sonicated chromatin was diluted tenfold and incubated with ~5 µg 
antibody overnight. Antibody-chromatin complexes were pulled down using 
protein A sepharose, washed and then eluted. After crosslink reversal and 
proteinase K treatment, immunoprecipitated DNA was extracted with phenol, 
precipitated in ethanol and treated with RNase. ChIP DNA was quantified by 
fluorometry using the Qubit assay (Invitrogen). 


Next-generation sequencing. For each ChIP or control sample, ~5 ng of DNA 
was used to generate a standard Illumina sequencing library. Briefly, DNA 
fragments were end-repaired using the End-It DNA End-Repair Kit (Epicentre), 
extended with a 3′ ‘A’ base using Klenow (3′→5′ exo⁻, 0.3 U µl⁻¹, NEB), ligated 
to standard Illumina adapters (75 bp with a ‘T’ overhang) using DNA ligase 
(0.05 U µl⁻¹, NEB), gel-purified on 2% agarose, retaining products between 
275 and 700 bp, and subjected to 18 PCR cycles. These libraries were quantified 
by fluorometry and evaluated by quantitative PCR or a 
multiplexed-digital-hybridization-based analysis (NanoString nCounter) to confirm representation 
and specific enrichment of DNA species. Libraries were sequenced in one or two 
lanes on the Illumina Genome Analyser using standard procedures for cluster 
amplification and sequencing by synthesis. 

Expression profiling. Cytosolic RNA was isolated using RNeasy Columns 
(Qiagen) from the same cell lots as above. Gene expression profiles were acquired 
using Affymetrix GeneChip arrays. The data were normalized using the 
GenePattern expression data analysis package. CEL files were processed by 
RMA, quantile normalization and background correction. Two replicate expres- 
sion data sets for each cell type were averaged and log-transformed. Gene-level 
normalization across cell types was computed by mean normalization. 

Primary processing of sequencing reads. ChIP-seq reads were aligned to human 
genome build HG18 with MAQ (http://maq.sourceforge.net/maq-man.shtml) 
using default parameters. All reads were truncated to 36 bases before alignment. 
Signal density maps for visualization were derived by extending sequencing reads 
by 200 bp in the 3’ direction (the estimated median size of ChIP fragments), and 
then counting the total number of overlapping reads at 25-bp intervals. Replicate 
ChIP-seq experiments were verified by comparing enriched intervals as 
previously described, and were then combined into a single data set. For the hidden 
Markov model (HMM), density maps were derived by extending sequencing 
reads by 200 bp in the 3′ direction and then assigning them to a single 200-bp 
window based on the midpoint of the extended read. These maps were then 
binarized at 200-bp resolution on the basis of a Poisson background model using 
a threshold of 10⁻⁴. 
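
The binning and binarization just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: reads are given as hypothetical `(five_prime_position, strand)` pairs, the Poisson rate is taken to be the mean read count per bin over the track, and the function names are invented here.

```python
import math
from collections import Counter

BIN = 200      # window size in bp
EXTEND = 200   # read extension in bp (estimated fragment size)


def bin_reads(reads):
    """Assign each read to the 200-bp bin holding the midpoint of its extension.

    `reads` holds (five_prime_position, strand) pairs; extension proceeds
    downstream for '+' reads and upstream for '-' reads.
    """
    counts = Counter()
    for pos, strand in reads:
        mid = pos + EXTEND // 2 if strand == "+" else pos - EXTEND // 2
        counts[max(mid, 0) // BIN] += 1
    return counts


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the cumulative sum."""
    term = math.exp(-lam)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)


def binarize(counts, n_bins, threshold=1e-4):
    """Mark a bin 1 when its count is unlikely under the background rate."""
    lam = sum(counts.values()) / n_bins  # assumed background: mean reads per bin
    return [1 if poisson_sf(counts.get(b, 0), lam) < threshold else 0
            for b in range(n_bins)]


# Toy track of 10 bins: 12 reads pile up in bin 2, three others are scattered.
reads = [(450, "+")] * 12 + [(50, "+"), (1250, "-"), (1850, "+")]
calls = binarize(bin_reads(reads), n_bins=10)
```

With a background of 1.5 reads per bin, only the bin holding 12 reads passes the 10⁻⁴ threshold in this toy example.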

Joint learning of HMM states across cell types. To handle data from the nine cell 
types, we concatenated their genomes to create an extended virtual genome that 
we used to train the HMM. We applied the model to ten tracks corresponding to 
the different chromatin marks and input using a multivariate HMM as previously 
described. Here we used a Euclidean distance for determining initial parameters 
for the nested initialization step. After the HMM had learned and evaluated a set 
of roughly nested models, considering up to 25 states, we focused on a 15-state 
model that provides sufficient resolution to resolve biologically meaningful chro- 
matin patterns and yet is highly reproducible across cell types when indepen- 
dently processed (Supplementary Fig. 7). We used this model to compute the 
probability that each location is in a given state, and then assigned each 200-bp 
interval to its most likely state for each cell type. Even though our model focuses 
on presence/absence frequencies of marks, we found that our states also capture 
signal intensity differences between high-frequency and low-frequency marks 
(Supplementary Fig. 9). 
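
The final assignment step amounts to taking, for each 200-bp interval, the state with the highest posterior probability. A minimal sketch, assuming the posteriors are available as one list of per-state probabilities per interval (the function name and data layout are hypothetical):

```python
def most_likely_states(posteriors):
    """Return the 1-based index of the most probable state for each interval."""
    return [max(range(len(p)), key=p.__getitem__) + 1 for p in posteriors]


# Two intervals under a toy 3-state model.
labels = most_likely_states([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]])
```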

Enrichment analysis. For each state, enrichments for different annotations were 
computed at 200-bp resolution with the exception of conservation, which was 
computed at nucleotide resolution. We used annotations obtained through the 
UCSC Genome Browser for RefSeq TSSs and transcribed regions, PhastCons, 
DNase-seq for K562 cells, c-Myc ChIP-seq for K562 cells, NF-κB ChIP-seq for 
GM12878, Oct4 in ESCs and nuclear lamina. Gene functional group 
enrichments were determined using STEM and biological process annotations in the 
Gene Ontology database. P values were calculated on the basis of the 
hypergeometric distribution and corrected for multiple testing using Bonferroni correction. 

Comparing chromatin state assignments between cell types. For each pair of 
cell types, the chromatin state assignments at each genomic position were com- 
pared. We calculated the frequency with which each pair of states occurred, and 
normalized this against the expected frequency based on the amount of genome 
covered by each state. The fold enrichments in Fig. 2a reflect an aggregation 
across all 72 possible pairs of cell types. 
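
For a single pair of cell types, the observed-over-expected calculation could look like the sketch below. The function name and the toy state labels are assumptions for illustration; aggregation over all ordered pairs of cell types is not shown.

```python
from collections import Counter


def pair_enrichment(states_a, states_b):
    """Fold enrichment of each (state in A, state in B) pair over expectation.

    The expected frequency of a pair is the product of the fractions of
    positions covered by each state in the respective cell type.
    """
    n = len(states_a)
    joint = Counter(zip(states_a, states_b))
    freq_a = Counter(states_a)
    freq_b = Counter(states_b)
    return {
        (s, t): (c / n) / ((freq_a[s] / n) * (freq_b[t] / n))
        for (s, t), c in joint.items()
    }


# Toy state assignments at eight aligned positions in two cell types.
cell_a = [1, 1, 4, 4, 13, 13, 13, 13]
cell_b = [1, 1, 4, 13, 13, 13, 13, 13]
enrich = pair_enrichment(cell_a, cell_b)
```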

Pairwise promoter clustering. Promoters for RefSeq genes were clustered on the 
basis of the most likely chromatin state assignment across a 2-kb region centred 
on the TSS. Clustering was performed jointly across GM12878 and HSMM, and 
was restricted to genes with corresponding Affymetrix expression. Briefly, each 
promoter was treated as a 330-element binary vector in which each component 
corresponded to a position along the promoter, cell type and state. Clustering was 
performed on these vectors using the k-means algorithm in MATLAB. Gene 
expression values were calculated on the basis of the corresponding Affymetrix 
probe set closest to the TSS. 
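
One way to obtain a 330-element binary vector is a one-hot encoding over 11 positions, 2 cell types and 15 states (11 × 2 × 15 = 330). That factorization is our reading of the text, which gives only the total and what the components index; the layout and names below are likewise illustrative assumptions.

```python
N_POS, N_CELL, N_STATE = 11, 2, 15  # assumed factorization of 330


def encode_promoter(assignments):
    """One-hot encode state calls into a flat binary vector of length 330.

    `assignments[cell][pos]` is the 1-based chromatin state at position `pos`
    of the promoter window in cell type `cell`.
    """
    vec = [0] * (N_POS * N_CELL * N_STATE)
    for c in range(N_CELL):
        for p in range(N_POS):
            s = assignments[c][p] - 1
            vec[(c * N_POS + p) * N_STATE + s] = 1
    return vec


# Toy promoter: state 1 throughout in one cell type, state 13 in the other.
vec = encode_promoter([[1] * N_POS, [13] * N_POS])
```

Vectors of this form can be passed directly to any k-means implementation, since Euclidean distance between them counts disagreeing position/cell-type calls.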

Multicell type promoter and enhancer clustering. Promoter state clustering was 
performed for all 200-bp intervals assigned to the strong promoter state (state 1) 
in at least one cell type. Each interval was represented by a single vector whose 
components are the estimated probabilities that it be in the strong promoter state 
for each of the nine cell types, accounting for model assignment uncertainty and 
biological noise. These were determined from the model posterior probabilities of 
state assignments and a comparison of state assignments in replicate experi- 
mental data. Clustering was performed using the k-means algorithm in 
MATLAB. We found that 20 clusters provided sufficient resolution to distinguish 
major cell-type-specific patterns. Enhancer state clustering was performed for all 
200-bp intervals assigned to strong enhancer state 4 in at least one cell type using 
identical procedures. For the purposes of display in Fig. 2, the locations were 
randomly down-sampled. For the purpose of identifying enriched functional 
gene categories in Fig. 2b, enhancers were linked to the nearest TSS up to 50 kb 
distant excluding those within 5 kb. Enhancer-gene correspondences based on 
the nearest gene were used for the expression analysis of distance-based linked 
genes in Fig. 3b. 

Linking enhancer locations to correlated genes. To predict linkages between 
enhancer states and target genes, we combined distance-based information with 
multicell type correlations between gene expression levels and normalized signal 
intensities for histone modifications associated with enhancer states (H3K4me1, 
H3K4me2 and H3K27ac). For each enhancer state (4-7), cell type, and 200-bp 
interval between 5 kb and 125 kb from the TSS, we trained logistic regression 
classifiers. The classifiers were trained to use mark intensity/expression correla- 
tion values to distinguish real instances of pairs of enhancer states and gene 
expression values from control pairs based on randomly re-assigning expression 
values to different genes. So that the classifiers learned a smooth and robust 
function at each position, we included as part of the training all enhancer state 
assignments within a 10-kb window centred at the position. The link score for a 
specific enhancer-gene linkage was defined as the ratio of the corresponding 
logistic regression classifier probability score to that for the randomized data. 

For the evaluation of the expression quantitative trait loci (QTL) analysis, we 
used a link score threshold of 2.5. The expression QTL data were obtained from the 
University of Chicago QTL browser (http://eqtl.uchicago.edu/cgi-bin/gbrowse/eqtl/). 
In the QTL evaluation, each SNP that overlapped a strong enhancer state (4 
or 5), was within 125 kb of a TSS, excluding locations within 5 kb, and was 
associated with a gene for which we had expression data was considered eligible 
to be supported by our linked predictions. We computed the fraction we observed 
linked on the basis of our linked predictions relative to the fraction that would be 
expected to be linked conditioned on knowing the distance distributions of the 
SNPs relative to the gene TSS. 

For the evaluation of linked predictions using the Gene Ontology database, we 
used the same link score threshold and compared gene assignments against the 
distance-based assignments defined above. The base set of genes in the enrich- 
ment analysis here were all genes that could be linked in at least one cluster. 

Motif and transcription factor analysis. A database of known transcription 
factor motifs was collated by combining motifs from TRANSFAC (version 
11.3), JASPAR (2010-05-07) and protein-binding microarray data sets. 
Motif instances in non-coding and non-repetitive regions of the genome were 
identified using these motifs and sequence conservation using a 29-way 
alignment of eutherian mammal genomes (K. Lindblad-Toh et al., submitted). These 
were filtered using a significance threshold of P < 4⁻⁸ for the motifs, and a 
confidence level based on conservation. Motifs were linked to corresponding 
transcription factors using metadata provided by the source. Motif enrichments 
for chromatin state clusters were computed as ratios to the instances of shuffled 
motifs, to correct for non-specific conservation and composition. A confidence 
interval was calculated for each ratio using Wilson score intervals (z= 1.5), 
selecting the most conservative value within the confidence interval. In cases 
where multiple motif variants were available for the same transcription factor, 
the one that showed the most variance in enrichment across clusters was selected. 
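
The Wilson score interval itself is standard, and a sketch of it with z = 1.5 is given below. How the interval was combined with the shuffled-motif ratio is not spelled out in the text, so `conservative_ratio` is only one plausible reading: it compares a proportion with a background proportion and reports the interval bound closest to no enrichment. Both function names are hypothetical.

```python
import math


def wilson_interval(k, n, z=1.5):
    """Wilson score interval for a binomial proportion k/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def conservative_ratio(k, n, background, z=1.5):
    """Assumed reading: enrichment ratio using the bound nearest to 1.

    Returns 1.0 when the background proportion lies inside the interval.
    """
    lo, hi = wilson_interval(k, n, z)
    if lo > background:
        return lo / background
    if hi < background:
        return hi / background
    return 1.0


lo, hi = wilson_interval(50, 100)
ratio = conservative_ratio(50, 100, background=0.25)
```

Taking the bound nearest to 1 shrinks enrichments supported by few instances, which is the practical effect of choosing the most conservative value within the interval.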

For predicting causal activators and repressors, motif scores and transcription 
factor expression scores were correlated as follows. Motif scores were calculated 
as described above. Transcription factor expression scores were calculated for 
each cluster by correlating the expression of the transcription factor across the cell 
types with the activity profile of the enhancers in that cluster (defined by the 
cluster means from the k-means clustering). The motif scores and the transcrip- 
tion factor expression scores were then correlated against each other to identify 
positively and negatively correlated transcription factors. 
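
The two-level correlation described in this paragraph can be made concrete with a small sketch. It assumes Pearson correlation at both levels and uses invented toy values; the text does not specify the correlation measure, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def tf_expression_scores(tf_expr, cluster_profiles):
    """Per-cluster score: correlation of TF expression across cell types
    with the cluster's activity profile (e.g. k-means cluster means)."""
    return [pearson(tf_expr, profile) for profile in cluster_profiles]


def regulator_call(motif_scores, tf_expr, cluster_profiles):
    """Correlate motif scores with expression scores across clusters."""
    r = pearson(motif_scores, tf_expression_scores(tf_expr, cluster_profiles))
    label = "activator" if r > 0 else "repressor" if r < 0 else "none"
    return r, label


# Toy example: three cell types, three clusters each active in one cell type.
profiles = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
tf_expr = [9, 1, 1]                      # factor expressed in cell type 1 only
r_act, call_act = regulator_call([2.0, 0.1, -0.3], tf_expr, profiles)
r_rep, call_rep = regulator_call([-2.0, 0.1, 0.3], tf_expr, profiles)
```

In the first call the motif is enriched in the cluster where the factor is expressed, giving a positive correlation; in the second it is depleted there, giving a negative one.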

Transcription factor/motif interactions predicted for strong enhancer states in 
specific cell types were validated by using the raw ChIP-seq tag enrichments as 
proxy for nucleosome positioning. For this purpose, sequencing reads were pro- 
cessed as above, except that the middle 75 bp of inferred ChIP fragments were 
used to derive signal density informative of nucleosome depletion (dips), as 
previously described. Superposition plots show tag enrichments relative to a 
uniform background computed on the basis of sequencing depth. 

Quantitative real-time PCR. Enrichment ratios for RNAPII and H2A.Z ChIPs 
were determined relative to input chromatin by quantitative real-time PCR using 
an ABI 7900 detection system, in biological replicate as described previously. 
Regions used for validation correspond to three different chromatin states, 
including 13 for state 1 (arbitrarily selected), 11 for state 4 (arbitrarily selected 
but excluding regions within 2 kb of a state-1 annotation) and 11 for state 7 
(arbitrarily selected but excluding regions within 2 kb of a state-1 or state-4 
annotation). PCR primers are listed in Supplementary Data 1. 

Functional enhancer assays. The SV40 promoter was first inserted between the 
HindIII and NcoI sites of pGL4.10 (Promega). Next, 250-bp sequences from the 
reference genome (hg18) corresponding to different chromatin states (eight from 
HepG2 state 4, seven from HepG2 state 7 and seven from GM12878 state 4) were 
synthesized (GenScript) and then inserted between the two SfiI sites upstream of 
the SV40 promoter. HepG2 cells were seeded into 96-well plates at a density of 
5 × 10⁴ cells per well and expanded overnight to ~50% confluency. The cells were 
then transfected with 400 ng of a pGL4.10-derived plasmid and 100 ng of pGL4.73 
(Promega) using Lipofectamine LTX. Firefly and Renilla luciferase activities were 
measured 24 h post-transfection using Dual-Glo (Promega) and an EnVision 
2103 multilabel reader (PerkinElmer), from triplicate experiments. Data are 
reported as light units relative to a control plasmid. For validation of causal 
transcription factor motifs, ten sequences of 250 bp corresponding to HepG2- 
specific strong enhancers (state 4) with dips and HNF motifs were tested as above, 
and compared with identical sequences except with the HNF motif permuted. 
Tested enhancer elements are listed in Supplementary Data 1. 

GWAS SNP analysis. The GWAS variants and SNP coordinates were obtained 
from the NHGRI catalogue and the UCSC browser*””* (October 30, 2010). This 
set was refined by extending the blood lipid GWAS"' set to contain all reported 
SNPs, and by bifurcating the haematological and biochemical traits study** into a 
haematological traits set and a biochemical traits set. We limited our analysis to 
studies reporting two or more associated SNPs. The variants from each study 
were intersected with chromatin states from each of the cell types. The reported P 
values were based on the overlap of associated SNPs with strong enhancer states 4 
and 5. We controlled for non-independence between proximal SNPs by using a 
randomization test where SNPs were randomly shifted while preserving relative 
distance. We then defined an estimated false-discovery rate based on permuta- 
tions in which SNPs were randomly re-assigned to different studies, and recom- 
puted P values. Estimates of false-discovery rates based on these permutations 
control for multiple testing of studies and cell types and for general non-specific 

Low strength of deep San Andreas fault gouge from SAFOD core 

David A. Lockner¹, Carolyn Morrow¹, Diane Moore¹ & Stephen Hickman¹ 


The San Andreas fault accommodates 28–34 mm yr⁻¹ of right-lateral 
motion of the Pacific crustal plate northwestward past the North 
American plate. In California, the fault is composed of two distinct 
locked segments that have produced great earthquakes in historical 
times, separated by a 150-km-long creeping zone. The San Andreas 
Fault Observatory at Depth (SAFOD) is a scientific borehole located 
northwest of Parkfield, California, near the southern end of the 
creeping zone. Core was recovered from across the actively deform- 
ing San Andreas fault at a vertical depth of 2.7 km (ref. 1). Here we 
report laboratory strength measurements of these fault core materi- 
als at in situ conditions, demonstrating that at this locality and this 
depth the San Andreas fault is profoundly weak (coefficient of fric- 
tion, 0.15) owing to the presence of the smectite clay mineral sapo- 
nite, which is one of the weakest phyllosilicates known. This Mg-rich 
clay is the low-temperature product of metasomatic reactions 
between the quartzofeldspathic wall rocks and serpentinite blocks 
in the fault”. These findings provide strong evidence that deforma- 
tion of the mechanically unusual creeping portions of the San 
Andreas fault system is controlled by the presence of weak minerals 
rather than by high fluid pressure or other proposed mechanisms’. 
The combination of these measurements of fault core strength with 
borehole observations’*” yields a self-consistent picture of the stress 
state of the San Andreas fault at the SAFOD site, in which the fault is 
intrinsically weak in an otherwise strong crust. 

SAFOD is a deep scientific borehole that penetrates the San Andreas 
fault (SAF) at a vertical depth of approximately 2.7 km and is the deepest 
land-based scientific drilling project to cross a plate-bounding fault'*’ 
(see http://www.earthscope.org for additional information). During 
phase 2 drilling in 2005, the basic structure of the SAF was determined 
(Fig. 1) using borehole logging data’ and supplementary laboratory 
studies of the drilling cuttings*°®. At 2.7km depth, the damage zone 
associated with the fault is approximately 200 m wide, and two actively 
creeping strands were identified within it by accumulated deforma- 
tion of the steel casing in the main borehole’. These two active shear 
zones, referred to as the southwest deforming zone (SDZ) and the 
central deforming zone (CDZ), were primary targets of the phase 3 
multilateral core drilling operation in 2007. Approximately 31 m of core 
were recovered from across the SDZ, CDZ and adjoining damage-zone 
rocks, including 1.6m and 2.6m of highly foliated, incohesive fault 
gouge associated with the SDZ and CDZ, respectively. 

We have completed frictional strength measurements on 25 core 
samples that span the important lithologic units. Of these, 17 are 
detrital sedimentary rocks, ranging from fine-grained sandstones to 
mudstones; representative X-ray diffraction (XRD) patterns are pre- 
sented in Supplementary Fig. 2. The SDZ and CDZ are represented by 
four samples apiece. In marked contrast to the adjoining rocks, both 
foliated gouge zones consist of porphyroclasts of serpentinite and 
sedimentary rock dispersed in a matrix of Mg-rich clays’? (Sup- 
plementary Fig. 3). XRD patterns of the CDZ were dominated by 
saponite (estimated to be greater than 60% from petrographic analysis) 
with some quartz and calcite. The SDZ gouge was composed primarily 
of saponite+corrensite with some quartz and feldspars (corrensite is a 
regularly interlayered chlorite-saponite clay). The porphyroclasts are 
also partly altered to Mg-rich clays’. The two gouge zones are inter- 
preted to be the product of shearing-enhanced metasomatic reactions 
between serpentinite, tectonically entrained within the fault, and 
adjoining sedimentary rocks”. 

Figure 1 | Location and strengths of SAFOD core samples. a, Map view of 
SAF damage zone and SAFOD boreholes from phase 2 (blue) (indicating 
actively deforming casing) and phase 3 Hole G (red) with location of recovered 
core (yellow) at approximately 2.7 km vertical depth. Active deformation zones 
are shown in orange. b, Frictional strength of core samples plotted versus 
measured depth along Hole G (at sliding rate $V = 1.15$ μm s⁻¹). Active fault 
traces SDZ (3,196.4–3,198.1 m measured depth) and CDZ (3,296.6–3,299.1 m) 
have notably low strength. A few samples were tested with deionized (DI) 
water. Extrapolation to SAF plate rate reduces shear zone strength to $\mu \approx 0.15$. 

Sample strength is reported as coefficient of friction $\mu = \tau/\bar{\sigma}_n$, where 
$\tau$ and $\bar{\sigma}_n$ are respectively shear stress and effective normal stress on the 
test faults; we estimate in situ $\bar{\sigma}_n$ to be ~122 MPa (see below). Here, 
$\bar{\sigma}_n = \sigma_n - p$, where $\sigma_n$ is normal stress and $p$ is pore pressure. 
Representative strength tests are plotted in Fig. 2. Frictional strength 
was compiled from all deformation tests at 9.8–10.4 mm fault-parallel 
slip and sliding rate $V = 1.15$ μm s⁻¹. As shown in Fig. 2, nearly all 
time- and slip-dependent strengthening had ended by 10 mm slip, so 
that residual strength is reasonably represented by this value. Residual 
strength refers to the stable strength of the test sample once fully 
developed shear flow is established. Because control parameters 
(including $\sigma_n$, $p$, $V$ and pore fluid composition) are duplicated in the 
tests, variations in $\mu$ are attributed to mineralogical differences 
between samples. Samples outside the two shear zones show a gradual 
weakening trend, from $\mu \approx 0.6$ on the southwestern side to $\mu \approx 0.4$ on 
the northeastern side. This trend reflects a compositional change from 
more quartz-rich sandstone and siltstone on the southwestern side to 
more phyllosilicate-rich mudstones to the northeast (Supplementary 
Fig. 2). The SDZ marks the southwestern boundary of the damage 
zone, so that the samples with the highest residual strength reside 
outside the damage zone ($\mu \approx 0.50$–0.65). 

The most significant strength observation is the abrupt decrease in $\mu$ 
within the two actively deforming shear zones. All residual strength 
measurements of the foliated gouge (at 10.4 mm slip) yield $\mu \le 0.21$; 
the weakest sample has a strength of $\mu = 0.13$ (Fig. 1). A partly altered 
serpentinite porphyroclast from the SDZ has a strength of $\mu \approx 0.26$ 
and apparently survived by weaker matrix material flowing around it. 
The very low measured strengths are attributed to the abundance of 
the extremely weak mineral saponite ($\mu \approx 0.05$) (Fig. 2). Petrographic 
analysis indicates a saponite volume fraction of 60–65% in the foliated 
gouge matrix. Corrensite was also found in the SDZ and, based on 
composition, is likely to have a strength of $0.05 < \mu < 0.4$. Thus, 
corrensite along with increased quartz content (Supplementary Fig. 2) 
may be responsible for the marginally stronger frictional strength of the 
SDZ. Serpentine and minor amounts of other phyllosilicates (including 
chlorite, illite and micas) are present in the foliated gouge and, when 
added to the stronger quartz, feldspar and calcite constituents, result in a 
matrix strength that is consistent with estimates suggested by mixing law 
studies'* 7°. Rock fabric that localizes weak minerals can lower frictional 
strength relative to the strength of ground and mixed samples'*. Thus, 
the SAFOD foliated gouge may be even weaker in its undisturbed state 
than the values reported here. 

Figure 2 | Four representative deformation tests of core material from Hole 
G, with saponite for comparison. Periodic strength steps are due to decade 
changes in sliding rate (fault velocity in μm s⁻¹: fast (F), 1.15; medium (M), 
0.115; slow (S), 0.0115). Strength variations are attributed to compositional 
differences between samples as shown in XRD patterns. Bottom curve shows 
strength of monomineralic saponite gouge taken from vesicles in altered 
volcanic rocks from the Isle of Skye, Scotland (obtained from Mineralogical 
Research Co.). Permanent strengthening during some slow velocity steps is the 
result of time-dependent compaction. Foliated gouge is 3–4 times stronger than 
pure saponite owing to the presence of strong minerals like quartz. MD, 
measured depth. 

Shear strength of fault gouge material typically varies with sliding 
rate. Rate dependence can be important in determining deformation 
mode (stable or unstable) and in extrapolating shear strength to SAF 
deformation rates. Steady-state rate sensitivity is defined’* by the 
parameter $(a - b) = \mathrm{d}\mu_{ss}/\mathrm{d}\ln V$, where $\mu_{ss}$ is the steady-state friction 
coefficient at velocity $V$. Imposed velocity steps, as shown in Fig. 2, are used 
to determine $a - b$. Negative values promote unstable slip, whereas 
positive values are likely to result in stable creep. The serpentinite 
porphyroclast from the SDZ shows a range of both negative and positive 
values ($a - b = +0.0004 \pm 0.0014$) similar to past values reported for 
serpentinites’®'”. All other core samples have positive rate sensitivity. 
Samples taken from outside the foliated gouge zones have values in the 
range $+0.001 < a - b < +0.007$. For CDZ, the combined measurements 
resulted in $a - b = +0.0018 \pm 0.0008$. For SDZ, values are twice 
as large: $a - b = +0.0037 \pm 0.0007$. 
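
As a minimal sketch of how a single velocity step yields the rate-sensitivity parameter, the finite-difference form of the definition above can be written as below. The friction values are hypothetical and chosen only for illustration; the paper's own procedure also de-trends the curves for strain hardening, which is not reproduced here.

```python
import math

def rate_sensitivity(mu_before, mu_after, v_before, v_after):
    """Finite-difference estimate of a - b = d(mu_ss) / d(ln V)
    from steady-state friction either side of one velocity step."""
    return (mu_after - mu_before) / math.log(v_after / v_before)

# Hypothetical steady-state friction values for a decade step up
# in sliding rate (0.115 -> 1.15 micrometres per second).
a_minus_b = rate_sensitivity(mu_before=0.1560, mu_after=0.1601,
                             v_before=0.115, v_after=1.15)

# Positive values mean velocity strengthening, which favours stable creep.
stable = a_minus_b > 0
```

With these illustrative inputs the estimate is about +0.0018, of the same order as the values reported for the foliated gouge.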

Average in situ strengths for CDZ and SDZ gouges (Fig. 1) are 
$\mu = 0.16$ and 0.19, respectively. These measurements are determined 
for a slip rate of 1.15 μm s⁻¹ (36,000 mm yr⁻¹) and should be reduced 
to the appropriate in situ deformation rate (<34 mm yr⁻¹) of the SAF. 
(Note that the slowest imposed deformation rate in the strength tests 
(0.0115 μm s⁻¹, Fig. 2) is only about 11 times the in situ rate.) 
Extrapolation of test strengths using the observed rate sensitivity for 
the foliated gouge indicates upper bounds for steady-state strength of 
CDZ and SDZ, respectively, of $\mu = 0.14$ and 0.16. This scaling assumes 
that the slip rate across the 1-mm-thick test gouge layer should be 
compared to the SAF deformation rate that is accommodated by the 
combined thickness of the SDZ and CDZ (~4.2 m). Depending on 
how strain is partitioned within the shear zones, the actual in situ shear 
strength supported by the deforming zones could be much less. 
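
The extrapolation can be approximated with a short calculation. The sketch below assumes a log-linear dependence of friction on slip rate with constant $a - b$ and compares the laboratory slip rate directly with a 34 mm yr⁻¹ plate rate; the function names are ours, and the exact procedure used for the published bounds may differ in detail.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def um_per_s_to_mm_per_yr(v_um_s):
    """Convert a slip rate from micrometres per second to mm per year."""
    return v_um_s * 1e-3 * SECONDS_PER_YEAR

def extrapolate_mu(mu_lab, a_minus_b, v_lab, v_target):
    """Shift steady-state friction from v_lab to v_target,
    assuming a constant rate sensitivity a - b."""
    return mu_lab + a_minus_b * math.log(v_target / v_lab)

v_lab = um_per_s_to_mm_per_yr(1.15)  # roughly 36,000 mm/yr
v_saf = 34.0                         # mm/yr, upper end of the quoted plate rate

mu_cdz = extrapolate_mu(0.16, 0.0018, v_lab, v_saf)
mu_sdz = extrapolate_mu(0.19, 0.0037, v_lab, v_saf)
```

Under these assumptions the results are about 0.15 for the CDZ and 0.16 for the SDZ, close to the quoted upper bounds; the small difference for the CDZ may simply reflect rounding of the input values.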

Although the SAF is one of the most well-studied fault systems in 
the world, fundamental questions about its strength and mechanical 
properties remain unanswered’*. The SAF heat flow paradox was 
identified more than 40 years ago and is debated to this day'’””’. 
Essentially, if the shear strength of the SAF were consistent with common 
laboratory-derived Byerlee rock friction ($\mu > 0.6$), frictional heating 
of the fault during earthquakes and stable fault creep should result 
in increased temperature and heat flow adjacent to the fault zone. In 
addition, the maximum horizontal stress near the fault should be 
oriented at ~30° to the fault trace. However, no evidence of a heat- 
flow anomaly along the creeping section of the SAF has been found”, 
and borehole stress observations at SAFOD confirm that the max- 
imum horizontal stress at this locality is at a high angle to the fault 
trace**’®. Although formation fluid pressure is apparently above 
hydrostatic in the sedimentary sequence northeast of the fault, there 
is no evidence from SAFOD drilling operations that pore pressure 
within the fault zone is elevated relative to the country rock’. The direct 
measurement, reported here, of low frictional strength ($\mu \approx 0.15$) of 
foliated gouge material taken at depth from the actively deforming 
shear zones is consistent with both the lack of an observed heat flow 
anomaly and the maximum compressive stress oriented at a high angle 
to the fault trace. Also, the positive dependence of strength on slip rate 
of the fault gouge material is consistent with deformation by creep 
rather than by earthquakes. 

Saponite becomes unstable above about 150°C (ref. 26) and is 
unlikely to be found deeper in the fault zone than 3.5–4 km (observed 
temperature within the fault at ~2.7 km depth at SAFOD was 110–115°C; 
C. Williams, personal communication). Stable creep and low 
strength of the deep SAF in the creeping section may reflect the presence 
of other low-strength minerals, elevated fluid pressure, or enhanced 
chemical weakening at greater depth than penetrated by SAFOD. Still, 
when considering the mechanics of the SAF specifically at 2.7 km, at 
SAFOD, mineralogy alone appears sufficient to explain fault strength. 

Figure 3 | Stress state for SAFOD drill site at 2.7 km depth. Model is based 
on borehole observations (assuming hydrostatic pore pressure) and foliated 
gouge strength (blue, host rock; red, weak shear zone). Main panel shows 
relationship of stress states within and outside the weak shear zone. Inset shows 
the corresponding spatial orientation of horizontal stresses in the model. 

Rice”’ analysed the stress state of a weak fault (due to elevated pore 
pressure, p) embedded in a stronger crust in a transpressional regime. 
Tembe et al.** extended the Rice analysis to include fault gouge of 
arbitrary strength. We use $\sigma_H$ and $\sigma_h$ for maximum and minimum 
horizontal principal stresses, respectively, and denote values within the 
fault by a superscript ‘f’. Following Tembe et al., a stress diagram 
representative of the SAFOD site at 2.7 km depth, with $\mu^f = 0.14$, 
$\mu = 0.60$ and hydrostatic $p$, is shown in Fig. 3. The model requires that 
$\sigma_H$ (outside the weak shear zones) makes an angle of 77° to the strike of 
the SAF and is consistent with borehole observations at SAFOD showing 
high differential stresses in a transitional strike-slip to reverse-faulting 
stress regime, with $\sigma_H$ maintaining a high angle to the SAF 
(70–80°) at depth*®. Tractions on the fault, in this model, are 
$\tau = 17$ MPa and $\bar{\sigma}_n = 122$ MPa. 
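
The quoted tractions can be checked by resolving the horizontal principal stresses onto the fault plane. The sketch below assumes a vertical fault, and takes the maximum and minimum horizontal stresses as 153 and 76 MPa and the pore pressure as 27 MPa, which is how we read the model values in the Fig. 3 legend; the function name is ours.

```python
import math

def fault_tractions(sigma_H, sigma_h, p, angle_to_strike_deg):
    """Resolve horizontal principal stresses onto a vertical fault plane.

    angle_to_strike_deg is the angle between sigma_H and the fault strike.
    Returns (shear stress, effective normal stress) in the input units.
    """
    theta = math.radians(angle_to_strike_deg)
    mean = 0.5 * (sigma_H + sigma_h)
    half_diff = 0.5 * (sigma_H - sigma_h)
    # The fault normal lies (90 deg - angle) from sigma_H, which gives
    # sigma_n = sigma_H * sin^2(theta) + sigma_h * cos^2(theta).
    sigma_n = mean - half_diff * math.cos(2 * theta)
    tau = half_diff * math.sin(2 * theta)
    return tau, sigma_n - p

# Assumed model values (MPa) and the 77 degree angle given in the text.
tau, sn_eff = fault_tractions(153.0, 76.0, 27.0, 77.0)
mu_fault = tau / sn_eff
```

With these inputs the shear stress is about 17 MPa, the effective normal stress about 122 MPa, and their ratio about 0.14, matching the fault friction coefficient used in the model.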

SAFOD phase 3 drilling has provided, for the first time, continuous 
core samples from the actively deforming SAF at a depth of 2.7 km. A 
self-consistent picture is emerging about the strength and deformation 
processes of this complex portion of the SAF that represents the transi- 
tion from locked to creeping portions of the fault. Measurements of 
frictional strength of core material from within the SAF damage zone 
show two low-strength ($\mu \approx 0.15$) foliated gouge zones that are 1.6 and 
2.6 m wide. These zones correspond to the actively creeping shear 
zones that were independently identified by casing deformation within 
the phase 2 hole. These shear zones are embedded in stronger material 
with $\mu \approx 0.35$–0.65. The extremely low strength of the foliated gouge in 
an otherwise strong crust is sufficient to explain the observed orienta- 
tion of maximum compressive stress at a high angle relative to the 
strike of the fault (Fig. 3) without invoking high fluid pressure or other 
proposed fault-weakening mechanisms. 


METHODS SUMMARY 


We measured frictional strength of 25 samples obtained from the SAFOD phase 3 
Hole G core, composed of material that could be carved from the core or removed 
as chips or rubble. Some portions of the rock mass bounding the shear zones had 
sufficient cohesion to be sampled as solid mini-cores or prisms and will be 
reported on later. Samples were ground to a powder (<150 μm diameter) and 
sheared in 1- or 2-mm-thick gouge layers between 25.4-mm-diameter driving 
blocks in a triaxial deformation apparatus’ (Supplementary Fig. 1). Most samples 
were saturated with brine equivalent to in situ pore fluid (Y. Kharaka and J. 
Thordsen, personal communication) at 1 MPa constant pore pressure; a few tests 
were conducted with deionized water. Tests were performed at room temperature, 
constant effective normal stress (40, 120 and 200 MPa) and constant sliding rate 
(0.0115, 0.115 and 1.15 μm s⁻¹). Tests were carried out to 200 MPa to be applicable 
to in situ stress conditions (Fig. 3) and to allow for interpolation to other depths. 

Figure 3 (continued) | While maximum shear stress in host rock is high, principal 
stresses rotate within the shear zone to accommodate the weaker material. In the 
fault, mean stress is high but shear stress is low. In model: $\sigma_H = 153$ MPa; 
$\sigma_h = 76$ MPa; $\sigma_v = 67.5$ MPa; $p = 27$ MPa. See text for definitions of symbols used. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Sample preparation. We measured frictional strength of 25 samples that were 
obtained from the SAFOD phase 3 Hole G core. Hole G was cored in measured 
depth intervals 3,186.7-3,199.5 m and 3,294.9-3,312.7 m to sample localized shear 
zones within the SAF damage zone that correspond to the two intervals, referred to 
as SDZ and CDZ, where slow deformation was observed in the Phase 2 casing’. As 
indicated in Fig. 1, the cored intervals in Hole G were within or adjacent to the SAF 
damage zone as determined by logging data following phase 2 drilling. While 
selected samples were obtained to provide whole, undisturbed wafers for intact 
strength tests, samples tested here were either carved from the core or collected as 
rubble, chips or loose powder. All samples reported here were prepared by 
repeated gentle grinding with mortar and pestle until all material passed through 
a 100 mesh sieve (0.15 mm diameter). Resulting powder was then wetted to make a 
paste that was formed into a 1-mm-thick test layer (2-mm layers were used in the 
200 MPa tests). The first 14 tests were performed with deionized water. All remain- 
ing tests used a prepared brine solution that duplicates the major cations and their 
relative concentrations found in the formation fluid retrieved from the SAFOD 
drill hole on the northeastern side of the SAF (Y. Kharaka and J. Thordsen, 
personal communication). Test fluid constituents, expressed in units of grams 
per litre, are: Cl⁻, 13.32; Na⁺, 5.34; Ca²⁺, 2.77; and K⁺, 0.22. Comparison tests 
showed only a slight difference between frictional strength for samples sheared 
with deionized water and samples sheared with the brine solution. Before mech- 
anical testing, X-ray diffraction patterns were obtained to determine mineral 
composition and relative abundance. 
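
As a quick arithmetic check on the brine recipe given above (our own sanity check, not a step in the reported protocol), the listed concentrations can be converted to charge equivalents to confirm that the cations balance the chloride. The molar masses are standard values, rounded.

```python
# Standard molar masses (g/mol) and ionic charges.
MOLAR_MASS = {"Cl": 35.453, "Na": 22.990, "Ca": 40.078, "K": 39.098}
CHARGE = {"Cl": -1, "Na": 1, "Ca": 2, "K": 1}

# Test fluid constituents in grams per litre, as listed in the text.
brine_g_per_l = {"Cl": 13.32, "Na": 5.34, "Ca": 2.77, "K": 0.22}

# Charge equivalents per litre for each ion.
equivalents = {ion: CHARGE[ion] * g / MOLAR_MASS[ion]
               for ion, g in brine_g_per_l.items()}

cations = sum(v for v in equivalents.values() if v > 0)
anions = -sum(v for v in equivalents.values() if v < 0)
imbalance = (cations - anions) / (cations + anions)
```

The cation and anion totals both come to roughly 0.376 equivalents per litre, so the stated recipe is charge-balanced to well within 1%.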

Testing details. Tests were performed in a standard triaxial apparatus at room 
temperature and effective normal stresses of 40, 120 and 200 MPa. A constant pore 
pressure of 1 MPa was applied in all tests. Samples were 25.4-mm-diameter right- 
cylinders that contained a sawcut inclined 30° to the sample axis (Supplementary 
Fig. 1). The sawcut forcing blocks were sandstone-sandstone, sandstone-granite, 
or granite-granite pairs. Surfaces of forcing blocks were roughened with 100 grit 
abrasive, to assure good frictional contact with the applied gouge layer. See, for 
example, refs 9 and 13 for details. Berea sandstone forcing blocks have high 
permeability but also have 20% porosity that decreases with applied load. The 
standard test geometry used Berea for the top forcing block to assure rapid 
hydraulic communication of the fault with the external pore fluid system. Pore 
fluid flow in and out of the lower driving block was through the fault gouge. To 
minimize pore pressure transients due to stress changes, low-porosity granite was 
used for the lower driving block in most tests, and particularly in experiments with 
low-permeability clay-rich gouge. 

A greased Teflon shim, placed between the piston and the sample assembly, 
allowed easy lateral slip of the lower driving block during shearing (Supplementary 
Fig. 1). Samples were jacketed in 3.2-mm-thick latex tubing for isolation from 
confining fluid. Separate calibration tests showed that the latex tubing provided an 
equivalent shear resistance of 0.043 MPa mm⁻¹ due to stretching during deformation 
experiments. This shear strength correction has been applied to all test results. 
Shear and normal stresses have also been corrected for the reduction in contact 
area as the two sample halves slide past each other'’. Axial load was measured with 
an internal load cell. Axial shortening, confining pressure and pore pressure were 
all measured at 1-s intervals. Shear and normal stress resolved on the fault surface 
were computed in real time from the axial stress, confining pressure and axial 
shortening. As sample strength varied, confining pressure was adjusted every 
second to maintain constant normal stress. Axial stress, confining pressure and 
pore pressure have accuracies of at least 0.03 MPa. Samples were sheared to 9 mm 
axial shortening (~10.4 mm parallel to the sawcut) at axial shortening rates of 
0.01, 0.1 and 1.0 μm s⁻¹ to determine the dependence of shear strength on sliding 
rate and thereby the tendency for stable creep or unstable slip. Slip and slip rate on 
the inclined fault surfaces were 15% higher than the corresponding axial values. 
Steady-state changes in strength were estimated for individual velocity steps by 
measuring the residual strength change after de-trending the friction-displacement 
curves for long-term strain hardening. This procedure was carried out manually. 
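
The sawcut geometry described above can be sketched as follows. This is a simplified illustration, not the laboratory's data-reduction code: the function names are ours, the jacket correction is assumed to scale with fault-parallel slip, and the contact-area correction mentioned in the text is omitted.

```python
import math

SAWCUT_ANGLE_DEG = 30.0    # sawcut inclination to the sample axis
JACKET_MPA_PER_MM = 0.043  # latex jacket shear resistance (from the text)

def fault_slip(axial_shortening_mm):
    """Slip parallel to the sawcut for a given axial shortening."""
    return axial_shortening_mm / math.cos(math.radians(SAWCUT_ANGLE_DEG))

def resolve_on_sawcut(axial_stress, confining_pressure, pore_pressure, slip_mm=0.0):
    """Return (shear stress, effective normal stress) on the sawcut, in MPa.

    Assumes the jacket correction is proportional to fault-parallel slip.
    """
    theta = math.radians(SAWCUT_ANGLE_DEG)
    diff = axial_stress - confining_pressure
    tau = 0.5 * diff * math.sin(2 * theta) - JACKET_MPA_PER_MM * slip_mm
    sigma_n = confining_pressure + diff * math.sin(theta) ** 2
    return tau, sigma_n - pore_pressure

slip = fault_slip(9.0)                 # 9 mm axial shortening
slip_ratio = fault_slip(1.0)           # fault slip per unit axial shortening
tau, sn_eff = resolve_on_sawcut(151.0, 111.0, 1.0)  # hypothetical stresses, MPa
```

For a 30° sawcut the slip ratio is 1/cos 30° ≈ 1.15, which reproduces both the 15% figure and the ~10.4 mm of fault-parallel slip at 9 mm of axial shortening.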

Measurement accuracy. Sample strength is reported as coefficient of friction 
$\mu = \tau/\bar{\sigma}_n$. Within a single experiment, changes in $\mu$ have a precision of ±0.001. 
Reproducibility of $\mu$ between experiments, including variations due to sample 
preparation, is approximately ±0.005. Accuracy of $\mu$, after corrections for true 
contact area and jacket strength, is approximately ±0.01. Initial gouge layer thickness 
is 1.0 mm for 40 and 120 MPa tests. Gouge layer thickness is 2.0 mm for 
200 MPa tests to offset thinning at the higher normal stress. Compaction is not 
measured during experiments, but layer thickness following experiments is 
reduced by 5-30%, depending on normal stress, gouge clay content and driving 
block type. As shear will localize within the gouge layer to different degrees and at 
different times, depending on composition and normal stress, estimates of true 
shear strain are problematic. The deformation quantity that is most accurately 
determined in these experiments is total fault-parallel slip. This can be converted 
to a nominal shear strain by dividing by the initial 1 or 2 mm gouge thickness. 

Shank3 mutant mice display autistic-like behaviours and striatal dysfunction 

João Peça, Cátia Feliciano, Jonathan T. Ting, Wenting Wang, Michael F. Wells, Talaignair N. Venkatraman, 
Christopher D. Lascola, Zhanyan Fu & Guoping Feng 


Autism spectrum disorders (ASDs) comprise a range of disorders that share a core of neurobehavioural deficits characterized 
by widespread abnormalities in social interactions, deficits in communication as well as restricted interests and repetitive 
behaviours. The neurological basis and circuitry mechanisms underlying these abnormal behaviours are poorly understood. 
SHANK3 is a postsynaptic protein, whose disruption at the genetic level is thought to be responsible for the development of 
22q13 deletion syndrome (Phelan-McDermid syndrome) and other non-syndromic ASDs. Here we show that mice with 
Shank3 gene deletions exhibit self-injurious repetitive grooming and deficits in social interaction. Cellular, 
electrophysiological and biochemical analyses uncovered defects at striatal synapses and cortico-striatal circuits in Shank3 
mutant mice. Our findings demonstrate a critical role for SHANK3 in the normal development of neuronal connectivity and 
establish causality between a disruption in the Shank3 gene and the genesis of autistic-like behaviours in mice. 


Autism and autism spectrum disorders (ASDs) are neurodevelopmental 
disorders diagnosed based on a triad of criteria: deficits in communica- 
tion, impaired social interaction, and repetitive or restricted interests and 
behaviours’. ASDs are highly heritable disorders with concordance rates 
as high as 90% for monozygotic twins’. Recent genetic and genomic 
studies have identified a large number of candidate genes for ASDs’, 
many of which encode synaptic proteins**, indicating synaptic dysfunc- 
tion may have a critical role in ASDs”*. One of the most promising ASD 
candidate genes is Shank3, which codes for a key postsynaptic density 
(PSD) protein at glutamatergic synapses. Disruption of Shank3 is 
thought to be the cause of core neurodevelopmental and neurobeha- 
vioural deficits in the 22q13 deletion syndrome (Phelan-McDermid 
syndrome), an autism spectrum disorder®"'. Furthermore, recent genetic 
screens have identified several mutations/rare variants of the Shank3 
gene in ASD patients outside of diagnosed 22q13 deletion syndrome’*”’. 
The Shank family of proteins (SHANK1-3) directly bind SAPAP (also 
known as DLGAP) to form the PSD-95-SAPAP-SHANK complex'* 
(PSD-95 is also known as DLG4). This core of proteins is thought to 
function as a scaffold, orchestrating the assembly of the macromolecular 
postsynaptic signalling complex at glutamatergic synapses. Currently, 
however, little is known about the in vivo function of SHANK3 at the 
synapse and how a disruption of Shank3 may contribute to ASDs. Here 
we demonstrate that genetic disruption of Shank3 in mice leads to com- 
pulsive/repetitive behaviour and impaired social interaction, resembling 
two of the cardinal features of ASDs. Biochemical, morphological and 
electrophysiological studies revealed synaptic dysfunction at cortico- 
striatal synapses, part of the neural circuits strongly implicated as dys- 
functional in ASDs. Our studies provide a synaptic and circuitry 
mechanism underlying Shank3 disruption and ASD-like behaviours. 


Shank3B−/− mice display repetitive grooming 
The Shank3 gene codes for large proteins with multiple protein-protein 
interaction domains (Fig. 1a). We generated two different 
alleles of SHANK3 mutant mice. In Shank3A mutant mice, we targeted 
a portion of the gene encoding the ankyrin repeats (Supplementary Fig. 
1b). This resulted in a complete elimination of SHANK3α, the longest 
SHANK3 isoform (Fig. 1b). However, the other two isoforms were not 
affected (here named SHANK3β and SHANK3γ). In Shank3B mutants, 
we targeted the fragment encoding the PDZ domain (Supplementary 
Fig. 1c). This led to the complete elimination of both SHANK3α and 
SHANK3β isoforms and a significant reduction of the putative 
SHANK3γ isoform at the PSD (−42.12% ± 9.27% of control, n = 3, 
P < 0.05) (Fig. 1b). Our analysis is mainly focused on the Shank3B−/− 
mutants due to their more pronounced behavioural and physiological 
defects. 

We used mice with a hybrid genetic background to avoid the potential 
contribution to behavioural phenotypes of homozygous genetic variants 
on a pure inbred background’”*. Initially, F1 hybrids from heterozygous 
× heterozygous matings were generated and homozygous mice were 
born at an expected Mendelian rate. However, homozygous knockout 
mice from this type of mating are smaller than their wild-type litter- 
mates, presumably due to inadequate competition for resources during 
early postnatal days leading to different developmental trajectories. We 
postulated that this size difference would influence our behavioural 
tests. To alleviate this confound, heterozygous animals were crossed 
in direct brother-sister matings for five generations from which we 
derived F5 isogenic hybrids in a mixed background. These isogenic 
animals were then used to generate time-mated homozygous × homozygous 
breeding pairs to obtain wild-type and mutant animals used in 
the experiments. F5 Shank3A and F5 Shank3B knockouts from these 
matings are reared to weaning age with body weights similar to those 
from F5 control animals. 

Shank3B−/− mice did not display any gross anatomical or histological 
brain abnormality, but on rare occasions exhibited seizures during 
handling in routine husbandry procedures. However, spontaneous 
seizures were never observed. By the age of 3–6 months, Shank3B−/− 
mice developed pronounced skin lesions with varying degrees of 
phenotypical penetrance: approximately 35% in the general holding 
colony (Fisher's exact test, P < 0.0001), and 100% in mating females 
that have produced 4–6 litters. The lesions tended to appear first on 
the back of the neck or on the face (Fig. 1c) and usually progressed 
bilaterally to cover large areas of the body. The lesions were self-inflicted, 
as they were present in animals socially isolated at weaning 
age, and not due to excessive allogrooming, as no lesions were found in 
wild-type or Shank3B+/− mice housed from birth with Shank3B−/− 
animals. Furthermore, 24 h videotaping in pre-lesion animals revealed 
that Shank3B−/− mice showed an increase in time spent grooming 
when compared to wild-type controls (Fig. 1d). These observations 
indicate that Shank3B−/− mice display excessive grooming and 
self-injurious behaviour. 

Figure 1 | Excessive grooming, skin lesions and anxiety-like behaviour in 
Shank3B−/− mice. a, SHANK3 protein structure. b, Western blot showing a 
pan-SHANK3 antibody staining in brain lysate, synaptosomal plasma 
membrane (SPM) and 2× Triton X-100-washed PSD (PSD) fraction in wild-type 
(WT), Shank3A−/− and Shank3B−/− mice. c, Four-month-old 
Shank3B−/− mice display neck and head lesions (arrows). d, Pre-lesion 
Shank3B−/− (KO) mice spent more time in self-grooming than WT. e, In the 
open field test, Shank3B−/− mice, when compared to controls, display 
decreased rearing activity. f, In the zero maze test, Shank3B−/− mice spent less 
time in the open area than wild-type controls. *P < 0.05, ***P < 0.001, 
two-tailed t-test for d and f, two-way repeated measures ANOVA with post hoc 
two-tailed t-test for e; all data are presented as means ± s.e.m. from 6–9 mice per 
genotype. 

We characterized the animals further in a battery of behavioural
tests. In the rotarod motor test, Shank3B−/− and control animals
performed at similar levels (Supplementary Fig. 2). In the open field test,
when compared to controls, Shank3B−/− mice showed similar levels of
activity and thigmotaxis (Supplementary Fig. 2). However, rearing,
which is a form of vertical exploration considered to be anxiogenic
for mice, was significantly reduced in the mutants (Fig. 1e). In the
elevated zero maze, the Shank3B−/− mice spent less time exploring
the open arms of the maze versus the closed arms (Fig. 1f). In the
light-dark emergence test, the Shank3B−/− mice displayed an increased
latency to cross into the brightly lit area, although the time spent in each
side of the box was similar between mutant animals and controls
(Supplementary Fig. 2). Thus, the Shank3B−/− mice display
anxiety-like behaviour and excessive, self-injurious grooming. In contrast,
Shank3A−/− mice displayed no lesions or anxiety-like behaviour
(Supplementary Fig. 3).

Social interaction deficits in Shank3B−/− mice

Deficits in social interaction are the most recognizable manifestation of
autistic behaviours in humans. We used a modified version of a
three-chamber social arena (ref. 17) to probe animals for their voluntary
initiation of social interaction and their ability to discriminate social
novelty. Initially, the test animal was left to explore and initiate social
contact with a partner (‘Stranger 1’) held inside a wired cage or with an
identical but empty wired cage (‘Empty cage’). In this test, the Shank3B−/−
mice displayed dysfunctional social interaction behaviour, as measured
either by the time spent in the compartment containing the social partner
(Fig. 2a, b) or by the time spent in close interaction (Fig. 2d). Notably,
Shank3B−/− mice exhibited a clear preference for interacting with the empty
cage rather than with the social partner (Fig. 2a, d). In a subsequent trial,
a novel social partner (‘Stranger 2’) was introduced into the previously empty
wired cage. Wild-type mice displayed a preference for the novel animal,
as shown by the increase in time spent in the compartment containing
‘Stranger 2’. The Shank3B−/− mutants spent markedly more time in the
centre chamber (Fig. 2c) and less time closely interacting
with either social partner (Fig. 2e). In an identical test, the
Shank3A−/− mice displayed normal initiation of social interaction,
but perturbed recognition of social novelty (Supplementary Fig. 4).

Additionally, in an open arena test, freely interacting dyadic pairs of
wild-type–Shank3B−/− mice spent less time in reciprocal interaction
and showed a lower frequency of nose-to-nose interaction and anogenital
sniffing when compared to wild-type–wild-type pairs (Supplementary
Fig. 5). Thus, data from both social interaction tests indicate that
Shank3B−/− mice display abnormal social interaction as well as deficits
in discriminating social novelty.

In our breeding scheme, Shank3B−/− mice and wild-type mice
were nurtured by Shank3B−/− and wild-type dams, respectively. To
assess the impact of maternal rearing on the observed sociability
defects, we performed time-mated cross-fostering of Shank3B−/−
mice and controls. Cross-fostering of Shank3B−/− neonatal pups with
wild-type dams (KO_WT) revealed social defects in the mutant mice that
were qualitatively equivalent to those observed in mutant mice
nurtured by Shank3B−/− dams. Additionally, rearing of wild-type neonatal
pups by Shank3B−/− dams (WT_KO) did not perturb normal sociability in
wild-type animals (Supplementary Fig. 6). These data further indicate a
genetic origin of the abnormal social behaviours in the Shank3 mutant
mice.


Altered PSD composition in the striatum 


The basal ganglia are one of the brain regions implicated as dysfunc- 
tional in ASD. The repetitive/compulsive grooming behaviour in 
Shank3B−/− mice also suggests defects in cortico-striatal function.
Furthermore, Shank3, but not Shank1 or Shank2, is highly expressed 
in the striatum (Fig. 3a) (Supplementary Fig. 7 and Supplementary 
Table 1). Therefore, we focused our analyses on striatal neurons and 
cortico-striatal synapses. 

Shank family members have been proposed as key regulators of the 
PSD at glutamatergic synapses'®. To determine how the disruption of 
Shank3 may affect the PSD protein network, we used biochemically 
purified PSDs from the striatum of wild-type and Shank3B−/− mice
and performed semiquantitative western blotting for several scaffold- 
ing proteins (Fig. 3b) and glutamate receptor subunits (Fig. 3c). At the 
PSD level, we observed reduced levels of SAPAP3, Homer-1b/c and 
PSD-93 (also known as HOMER1 and DLG2, respectively; Fig. 3b) as
well as a reduction in the glutamate receptor subunits GluR2, NR2A 
and NR2B (also known as GRIA2, GRIN2A and GRIN2B, respec- 
tively; Fig. 3c). These results suggest an altered molecular composition 
of postsynaptic machinery in the striatum and a possible disruption of 
glutamatergic signalling. 


Morphological defects of medium spiny neurons 
To test whether disruption of SHANK3 affects neuronal morphology, 
we traced Golgi-stained striatal medium spiny neurons (MSNs) and

Figure 2 | Reduced social interaction and abnormal social novelty
recognition in Shank3B−/− mice. a, Representative heat map analysis from
‘Stranger 1 – Empty’ and ‘Stranger 1 – Stranger 2’ trials from Shank3B−/− mice
and controls. b, In the social interaction test, Shank3B−/− mice (closed bars)
spent less time in the chamber containing the social partner (Stranger 1) and
more time in the chamber containing the empty wire cage when compared to
controls (open bars). c, In the social novelty test, Shank3B−/− mice do not
display a preference for the novel social partner (Stranger 2), and spent more
time in the middle chamber. d, e, When analysing social interaction by close
proximity (within 5 cm) to either ‘Stranger 1’ or ‘Empty cage’ (d), or
‘Stranger 1’ or ‘Stranger 2’ (e), Shank3B−/− mice displayed a clear reduction
in social interaction when compared to controls (d), whereas under a social
novelty paradigm (e), Shank3B−/− mice displayed a clear reduction in time
spent with ‘Stranger 2’. *P < 0.05, **P < 0.01, ***P < 0.0001; one-way ANOVA
with Bonferroni post hoc t-test for b–e; all data presented as means ± s.e.m.;
12–14 mice per group.


their dendrites to investigate the cellular morphology and complexity 
of these cells. Sholl analysis revealed neuronal hypertrophy as mea- 
sured by an increase in complexity of dendritic arborizations (Fig. 4a), 
total dendritic length (Fig. 4b) and also an increase in surface area 
(Fig. 4c) in Shank3B−/− MSNs.
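Sholl analysis, as used for the dendritic complexity measurement above, counts how often a neuron's dendrites cross concentric circles centred on the soma, as a function of radius. The following is a minimal sketch of that counting step only. It assumes the traced dendrites are available as short straight 2D segments, which is a hypothetical input format chosen for illustration and not the tracing software or pipeline used in this study; the function name `sholl_intersections` is likewise an assumption.

```python
import math


def sholl_intersections(segments, soma, radii):
    """Count dendrite crossings of circles centred on the soma.

    segments: list of ((x1, y1), (x2, y2)) straight pieces of traced dendrite
              (hypothetical input format)
    soma:     (x, y) centre of the concentric circles
    radii:    iterable of circle radii, in the same units as the coordinates
    Returns a dict mapping each radius to its number of crossings.
    """
    counts = {}
    for r in radii:
        crossings = 0
        for p, q in segments:
            d_p = math.dist(p, soma)
            d_q = math.dist(q, soma)
            # A segment is counted when its endpoints lie on opposite sides
            # of the circle. This assumes segments are short relative to the
            # radius step, so a segment never cuts the same circle twice.
            if (d_p - r) * (d_q - r) < 0:
                crossings += 1
        counts[r] = crossings
    return counts
```

Plotting the returned counts against radius gives the Sholl profile; a more complex arbor shows more crossings at a given distance from the soma.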

Next, we performed patch-assisted Lucifer Yellow cell filling of 
MSNs and measured spine density in control and Shank3B−/− mice.
Shank3B−/− mice displayed a significant reduction in spine density
(Fig. 4d, e). We did not observe significant changes in spine length or
head diameter; however, the neck width of Shank3B−/− MSN spines
was slightly larger than that of controls (Supplementary Fig. 8).

Figure 3 | Biochemical changes in striatal synapses of Shank3B−/− mice.
a, Only Shank3 mRNA is highly expressed in the striatum. b, Protein levels of
the scaffolding proteins SAPAP3, Homer and PSD-93 are reduced in striatal
PSD fractions from Shank3B−/− mice. αCaMKII and NLG-3 are also known as
CAMK2A and NLGN3, respectively. c, Protein levels of the glutamate receptor
subunits GluR2, NR2A and NR2B are reduced in striatal PSD fractions from
Shank3B−/− mice. GluR1 and NR1 are also known as GRIA1 and GRIN1,
respectively. Each lane was loaded with 3 μg of protein with β-actin as loading
control and normalized to wild-type levels. *P < 0.05, **P < 0.01,
***P < 0.001, two-tailed t-test; all data are presented as means ± s.e.m.; n = 3
samples per group, with each sample being a combined pool of striatal tissue
from three animals.


Finally, we analysed PSD morphology by electron microscopy
(Fig. 4f). We found a significant reduction in the mean thickness
(Fig. 4g) of PSDs from Shank3B−/− mice relative to controls.
PSD length was also significantly reduced in the
Shank3B−/− mice (Fig. 4h). Taken together, these results highlight
a critical in vivo role for Shank3 in the normal development of medium
spiny neurons and striatal glutamatergic synapses.


Striatal hypertrophy in Shank3B−/− mice

Even though there is no clear correlation between brain size or neuronal
hypertrophy specifically for Shank3 disruptions in humans, a
potential link between enlarged brain size, neuronal hypertrophy
and autism has been suggested previously”. In particular, increased
caudate volume in autism patients has been proposed to be linked to
repetitive behaviours’. We measured striatal volume using
three-dimensional magnetic resonance imaging in the intact brain of
Shank3B−/− and control mice. We found that there was no significant
difference in overall brain size between the genotypes. However,
measurement of caudate volume in the same animals revealed a small but
significant volumetric enlargement of this structure in Shank3B−/−
mice (Supplementary Fig. 9). These data suggest a correlation between
neuronal hypertrophy and brain volume, consistent with studies from
other mouse models of ASD”.

Figure 4 | Morphological and ultrastructural neuronal abnormalities in
Shank3B−/− mice. a, Sholl analysis reveals an increased neuronal complexity
of Shank3B−/− MSNs (red) when compared to MSNs from wild-type mice
(grey); example neurons are shown as insets (top, WT; bottom, KO).
b, c, MSNs from Shank3B−/− mice show an increase in total dendritic length
(b) and surface area (c) when compared to controls. d, Representative confocal
stacks of dye-filled MSNs from KO and WT mice; scale bar, 1 μm. e, Spine
density in MSNs from Shank3B−/− mice is lower than that of wild-type MSNs.
f, Examples of electron micrographs depicting the synaptic contacts with
presynaptic vesicles (arrowheads), postsynaptic densities (arrow) and dendritic
spine (asterisk); scale bar, 100 nm. g, Shank3B−/− PSDs are thinner than
wild-type PSDs. h, Shank3B−/− PSDs are shorter than wild-type PSDs. *P < 0.05,
**P < 0.01, ***P < 0.0001; two-way repeated measures ANOVA for a;
two-tailed t-test for b, c and e; two-sample Kolmogorov–Smirnov test for g and
h. Data in g and h are presented as cumulative frequency plots with histogram
distribution and Gaussian curve fit in the insets. Data from b, c and e are
presented as means ± s.e.m.; n = 36 from 3 wild-type mice and n = 36 from 3
Shank3B−/− mice for a–c; n = 41 dendritic segments from 3 wild-type mice
and n = 36 dendritic segments from 3 Shank3B−/− mice for e; n = 144 PSDs
from three wild-type mice and n = 140 PSDs from three Shank3B−/− mice for
g, h.


Perturbation of striatal postsynaptic function 

To elucidate the functional consequences of a disruption in Shank3 on 
synaptic function, we performed recordings of cortico-striatal synaptic 
circuitry in acute brain slices of 6-7-week-old animals. We found that 
field population spikes were significantly reduced in Shank3B−/− mice
when compared with controls (Fig. 5a). Presynaptic function was not 
altered, as indicated by the relationship of stimulation intensity to the 
amplitude of the action potential component of the response termed 
negative peak 1 (NP1) and the paired-pulse ratio (PPR; Supplementary 
Fig. 10). These results indicate that the reduction in total field res- 
ponses was most likely due to a postsynaptic impairment in synaptic 
function and/or a reduction in the number of functional synapses. 
Consistent with their mild behavioural phenotypes, Shank3A−/− mice
displayed minimal disruption at cortico-striatal synapses
(Supplementary Fig. 11).

Figure 5 | Cortico-striatal synaptic transmission in Shank3B−/−
MSNs. a, Cortico-striatal population spike amplitude is decreased in
Shank3B−/− mice (red trace) as measured by extracellular field recordings.
Inset, example traces for Shank3B−/− (KO) and wild-type (WT). b, mEPSC
example traces from wild-type and Shank3B−/− MSNs recorded with whole-cell
voltage clamp. c, d, Reduced mEPSC frequency (c) and amplitude (d) in
Shank3B−/− MSNs when compared to wild-type. e, PPR is unaltered in
Shank3B−/− MSNs. **P < 0.01, ***P < 0.001; two-way repeated measures ANOVA
with Bonferroni post hoc test for a; two-tailed t-test for c, d; all data
presented as means ± s.e.m. For field recordings, n = 13 slices from four mice
per group; for mEPSCs, n = 29 MSNs from wild-type mice, n = 32 MSNs from
Shank3B−/− mice.

We next performed whole-cell voltage clamp recordings of
α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor
miniature excitatory postsynaptic currents (AMPAR-mEPSCs) in
dorsolateral striatal MSNs. We found that the frequency of mEPSCs
was significantly reduced in Shank3B−/− MSNs (Fig. 5b, c). Because
we did not observe defects in presynaptic function when measuring
PPR (Fig. 5e), this indicates a reduction in the number of functional
synapses in Shank3B−/− MSNs. We also found a significant reduction of
peak mEPSC amplitude in Shank3B−/− MSNs (Fig. 5b, d), indicating
a reduction in the postsynaptic response from the available synapses.
We did not observe significant differences in the N-methyl-D-aspartate
receptor (NMDA)/AMPA receptor-mediated current ratio in
Shank3B−/− neurons (Supplementary Fig. 12). Finally, similar defects
in mEPSC frequency and amplitude were observed when comparing Shank3B−/−
and wild-type littermate mice obtained from heterozygous matings
(Supplementary Fig. 13). Together, these data demonstrate a critical
role for SHANK3 in postsynaptic function in cortico-striatal circuitry.
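The paired-pulse ratio (PPR) referred to above is conventionally the amplitude of the response to the second of two closely spaced stimuli divided by the amplitude of the response to the first. The sketch below computes it as a ratio of mean amplitudes across sweeps. This is one common convention and an assumption here, since the text does not state how sweeps were averaged; the function name and input lists are hypothetical.

```python
from statistics import mean


def paired_pulse_ratio(first_amplitudes, second_amplitudes):
    """Ratio of mean second-pulse amplitude to mean first-pulse amplitude.

    Both arguments are per-sweep response amplitudes in the same units
    (hypothetical inputs, one value per sweep).
    """
    if not first_amplitudes or len(first_amplitudes) != len(second_amplitudes):
        raise ValueError("need matching, non-empty lists of amplitudes")
    first = mean(first_amplitudes)
    if first == 0:
        raise ValueError("mean first-pulse amplitude is zero")
    return mean(second_amplitudes) / first
```

A ratio above 1 corresponds to paired-pulse facilitation and a ratio below 1 to depression; an unchanged ratio between genotypes is what the text interprets as intact presynaptic function.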

To assess whether the defects arising from Shank3 dysfunction were
specific to striatal circuitry or due to a broader CNS perturbation, we
performed a Morris water maze task for hippocampal-dependent
learning and memory. We found that Shank3B−/− mice performed
at the same levels as controls in both learning and probe trials
(Supplementary Fig. 14a–c). Reversal learning and probe trials again
demonstrated similar levels of performance between Shank3B−/−
mice and controls (Supplementary Fig. 14d–f). Concomitantly, we
performed electrophysiological recordings from the hippocampal 
CA1 sub-region and found no obvious difference in field recordings 
of population spikes or PPR between genotypes (Supplementary Fig. 
15a-c). In addition, we found no significant differences in mEPSC 
frequency or mEPSC amplitude (Supplementary Fig. 15d-f). These 
data suggest that the observed behavioural and synaptic defects are 
specific to discrete brain regions and are not part of an overall CNS 
dysfunction. 


Discussion 


Despite recent advances in the understanding of autism spectrum 
disorder genetics, the underlying neurobiological substrates and 
neural circuits involved in these disorders remain largely unknown. 
The Shank3 gene has become the focus of substantial interest, with an 
increasing body of evidence suggesting Shank3 as the causative gene
of the major neurological symptoms in the 22q13 deletion syn- 
drome”’*"***, Our present study with Shank3 mutant mice not only 
sheds light on a critical in vivo role for SHANK3 in striatal glutama- 
tergic synaptic structure and function, but also demonstrates causality 
between a disruption in this gene and the development of autistic-like 
behaviours in mice. 

In this study, we generated two mutant alleles for the Shank3 gene. 
These two lines of mice showed different levels of severity in synaptic 
defects and phenotypes. In humans, multiple mutations/variants of the
Shank3 gene have been identified that cluster at the ankyrin repeats
and downstream of PDZ domain*’*. Our data indicate that disrup- 
tions of different locations of the Shank3 gene can lead to varying 
degrees of functional defects, which may in part contribute to pheno- 
typic heterogeneity in Shank3-related ASDs. We should note that, in 
clinical conditions, the 22q13 deletions and the autism-associated 
Shank3 mutations are heterozygous, whereas in our current study, 
we used homozygous mutant mice to get a clear understanding of the 
physiological role of the Shank3 gene and the underlying functional 
consequences of its disruption. Further studies will be needed to 
elucidate potential functional deficits resulting from Shank3
haploinsufficiency in Shank3B+/− mice.

PSD-95-SAPAP-SHANK proteins form a key postsynaptic scaf- 
fold at glutamatergic synapses which interacts with many synaptic 
proteins, including the neurexin-neuroligin complex”. In addition to 
Shank3 (ref. 9), it is worth noting that Shank2 (refs 6, 8), SAPAP2 (ref. 
6), neurexin-1 (ref. 26) and neuroligin-3 and -4 (ref. 27) have all been
implicated in human ASDs. Therefore, the dysfunction of neurexin-
neuroligin-PSD-95-SAPAP-SHANK complex could underlie a com- 
mon synaptic mechanism for a subset of ASDs. 

The precise circuitry defects involved in autistic behaviours are 
poorly understood. Neuroimaging studies provide evidence that caud- 
ate and frontal-striatal circuitries are dysfunctional areas in ASD***°. 
Cortico-striatal circuitry dysfunction has also been strongly implicated 
in repetitive/compulsive behaviours in obsessive-compulsive disorder 
(OCD)****. We previously found that deletion of SAPAP3, which 
directly interacts with SHANK3 and is highly expressed in the stria- 
tum, leads to cortico-striatal circuitry dysfunction and OCD-like beha- 
viours including repetitive/compulsive grooming in mice’. Repetitive 
behaviours are also often seen in autistic patients and in some mouse 
models of ASDs****. SHANK3 is the most abundant SHANK family 
member expressed in the striatum and Shank3B−/− mice exhibit
excessive/repetitive grooming leading to skin lesions. Our data support
the hypothesis that repetitive behaviours in OCD and ASD may share a
common circuitry mechanism. 

The regulation of social behaviours and social interaction is 
thought to be controlled by several brain regions and circuits”’. 
Similarly, genetic makeup is thought to have a key role in the pheno- 
typical manifestation of social behaviours**. The robust social inter- 
action deficits in Shank3B mutant mice demonstrate a causal role for
the disruption of this gene in the genesis of social dysfunction and 
provide a valuable experimental system for future genetic dissection 
of the neuronal basis of social behaviour. 


METHODS SUMMARY 


Behavioural analysis. Young adult mice, 5–6 weeks old, were used for all
behavioural analyses except lesion scoring, which was performed in
4–5-month-old mice. All experiments were done blind to genotype. All
experimental procedures were reviewed and approved by the Duke University
Institutional Animal Care and Use Committee and the MIT Committee on
Animal Care.

Statistical analysis. Analyses were performed using Prism (GraphPad Software) 
and MATLAB (MathWorks). Details on particular tests used are described in the 
main text and in the methods section; a summary of the statistical analysis
for the behavioural data is presented in Supplementary Table 2.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 


Received 7 September 2010; accepted 22 February 2011. 
Published online 20 March 2011. 


1. American Psychiatric Association Task Force on DSM-IV. Diagnostic and statistical 
manual of mental disorders: DSM-IV-TR (American Psychiatric Association, 2000). 

2. Rosenberg, R. E. et al. Characteristics and concordance of autism spectrum 
disorders among 277 twin pairs. Arch. Pediatr. Adolesc. Med. 163, 907-914 
(2009). 

3. Abrahams, B. S. & Geschwind, D. H. Advances in autism genetics: on the threshold
of a new neurobiology. Nature Rev. Genet. 9, 341-355 (2008). 

4. Bourgeron, T. A synaptic trek to autism. Curr. Opin. Neurobiol. 19, 231-234 (2009).

5. Zoghbi, H. Y. Postnatal neurodevelopmental disorders: meeting at the synapse? 
Science 302, 826-830 (2003). 

6. Pinto, D. et al. Functional impact of global rare copy number variation in autism 
spectrum disorders. Nature 466, 368-372 (2010). 

7. Tabuchi, K. et al. A neuroligin-3 mutation implicated in autism increases inhibitory
synaptic transmission in mice. Science 318, 71-76 (2007). 

8. Berkel, S. et al. Mutations in the SHANK2 synaptic scaffolding gene in autism 
spectrum disorder and mental retardation. Nature Genet. 42, 489-491 (2010). 

9. Durand, C. M. et al. Mutations in the gene encoding the synaptic scaffolding protein
SHANK3 are associated with autism spectrum disorders. Nature Genet. 39, 25-27
(2006). 

10. Prasad, C. et al. Genetic evaluation of pervasive developmental disorders: the 
terminal 22q13 deletion syndrome may represent a recognizable phenotype. Clin. 
Genet. 57, 103-109 (2000). 

11. Wilson, H. L. et al. Molecular characterisation of the 22q13 deletion syndrome
supports the role of haploinsufficiency of SHANK3/PROSAP2 in the major 
neurological symptoms. J. Med. Genet. 40, 575-584 (2003). 

12. Moessner, R. et al. Contribution of SHANK3 mutations to autism spectrum 
disorder. Am. J. Hum. Genet. 81, 1289-1297 (2007). 

13. Gauthier, J. et al. Novel de novo SHANK3 mutation in autistic patients. Am. J. Med. 
Genet. B. Neuropsychiatr. Genet. 150B, 421-424 (2009). 

14. Kim, E. et al. GKAP, a novel synaptic protein that interacts with the guanylate 
kinase-like domain of the PSD-95/SAP90 family of channel clustering molecules. 
J. Cell Biol. 136, 669-678 (1997). 


15. Takeuchi, M. et al. SAPAPs. A family of PSD-95/SAP90-associated proteins
localized at postsynaptic density. J. Biol. Chem. 272, 11943-11951 (1997).

16. Zoghbi, H. Y. & Warren, S. T. Neurogenetics: advancing the “next-generation” of
brain research. Neuron 68, 165-173 (2010).

17. Moy, S. S. et al. Sociability and preference for social novelty in five inbred strains: an
approach to assess autistic-like behavior in mice. Genes Brain Behav. 3, 287-302
(2004).

18. Hung, A. Y. et al. Smaller dendritic spines, weaker synaptic transmission, but
enhanced spatial learning in mice lacking Shank1. J. Neurosci. 28, 1697-1708
(2008).

19. Redcay, E. & Courchesne, E. When is the brain enlarged in autism? A meta-analysis
of all brain size reports. Biol. Psychiatry 58, 1-9 (2005).

20. Langen, M. et al. Changes in the developmental trajectories of striatum in autism.
Biol. Psychiatry 66, 327-333 (2009).

21. Hollander, E. et al. Striatal volume on magnetic resonance imaging and repetitive
behaviors in autism. Biol. Psychiatry 58, 226-232 (2005).

22. Bourgeron, T. A synaptic trek to autism. Curr. Opin. Neurobiol. 19, 231-234 (2009).

23. Kwon, C. H. et al. Pten regulates neuronal arborization and social interaction in
mice. Neuron 50, 377-388 (2006).

24. Bonaglia, M. C. et al. Identification of a recurrent breakpoint within the SHANK3
gene in the 22q13.3 deletion syndrome. J. Med. Genet. 43, 822-828 (2006).

25. Irie, M. et al. Binding of neuroligins to PSD-95. Science 277, 1511-1515 (1997).

26. Kim, H. G. et al. Disruption of neurexin 1 associated with autism spectrum disorder.
Am. J. Hum. Genet. 82, 199-207 (2008).

27. Jamain, S. et al. Mutations of the X-linked genes encoding neuroligins NLGN3 and
NLGN4 are associated with autism. Nature Genet. 34, 27-29 (2003).

28. Silk, T. J. et al. Visuospatial processing and the function of prefrontal-parietal
networks in autism spectrum disorders: a functional MRI study. Am. J. Psychiatry
163, 1440-1443 (2006).

29. Horwitz, B., Rumsey, J. M., Grady, C. L. & Rapoport, S. I. The cerebral metabolic
landscape in autism. Intercorrelations of regional glucose utilization. Arch. Neurol.
45, 749-755 (1988).

30. Sears, L. L. et al. An MRI study of the basal ganglia in autism. Prog.
Neuropsychopharmacol. Biol. Psychiatry 23, 613-624 (1999).

31. Welch, J. M. et al. Cortico-striatal synaptic defects and OCD-like behaviours in
Sapap3-mutant mice. Nature 448, 894-900 (2007).

32. Shmelkov, S. V. et al. Slitrk5 deficiency impairs corticostriatal circuitry and leads to
obsessive-compulsive-like behaviors in mice. Nature Med. 16, 598-602 (2010).

33. Graybiel, A. M. Habits, rituals, and the evaluative brain. Annu. Rev. Neurosci. 31,
359-387 (2008).

34. McFarlane, H. G. et al. Autism-like behavioral phenotypes in BTBR T+tf/J mice.
Genes Brain Behav. 7, 152-163 (2008).

35. Blundell, J. et al. Neuroligin-1 deletion results in impaired spatial memory and
increased repetitive behavior. J. Neurosci. 30, 2115-2129 (2010).

36. Etherton, M. R., Blaiss, C. A., Powell, C. M. & Südhof, T. C. Mouse neurexin-1α
deletion causes correlated electrophysiological and behavioral changes
consistent with cognitive impairments. Proc. Natl Acad. Sci. USA 106,
17998-18003 (2009).

37. Insel, T. R. & Fernald, R. D. How the brain processes social information: searching 
for the social brain. Annu. Rev. Neurosci. 27, 697-722 (2004). 

38. Ebstein, R. P., Israel, S., Chew, S. H., Zhong, S. & Knafo, A. Genetics of human social 
behavior. Neuron 65, 831-844 (2010). 


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank C. Duarte, S. Chaterjee and A. Oliveira-Maia for 
discussions; L. Kruger and Q. Liu for technical assistance; A. Hadiono for assistance in 
behavioural annotation; D. Bredt for the PSD-93 antibody; T. Boeckers for the 
anti-SHANK3 antibody; S. Miller and P. Christopher for advice and assistance with 
electron microscopy techniques; J. Crawley for the demonstration of social behaviour 
tests; N. Calakos and Y. Wan for advice on electrophysiology studies; A. Graybiel for 
critical comments on the manuscript; D. Wang and the other members of the G.F.
laboratory for their support. We thank The Poitras Center for Affective Disorders
Research. This work was funded by a grant from NIMH/NIH (R01MH081201), a
Hartwell Individual Biomedical Research Award from The Hartwell Foundation, and a
Simons Foundation Autism Research Initiative (SFARI) grant award to G.F.; a NARSAD
Young Investigator Award and NIH Ruth L. Kirschstein National Research Service
Award (F32MH084460) to J.T.T.; an NIH (R03MH085224) grant to Z.F.; and doctoral
fellowships from the Portuguese Foundation for Science and Technology to J.P.
(SFRH/BD/15231/2004) and C.F. (SFRH/BD/15855/2005). C.F. would like to acknowledge
the support from the “Programa Gulbenkian de Doutoramento em Biomedicina”
(PGDB, Oeiras, Portugal) and J.P. the “Programa Doutoral em Biologia Experimental e
Biomedicina” (CNC, Coimbra, Portugal).


Author Contributions J.P., C.F., J.T.T., W.W., M.F.W., T.N.V., C.D.L. and Z.F. participated in
the execution and analysis of experiments. J.P., C.F., J.T.T., C.D.L., Z.F. and G.F.
participated in the interpretation of the results. J.P., C.F. and G.F. designed the
experiments and wrote the paper.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to G.F. (fengg@mit.edu). 


©2011 Macmillan Publishers Limited. All rights reserved 


METHODS 


Mice. Shank3 mutant mice were generated by homologous recombination in R1
embryonic stem cells, which were implanted in C57 blastocysts using standard
procedures. One targeting vector (Shank3A) was designed to replace exons 4–7
(containing the ankyrin repeat domains) and another vector (Shank3B) was designed
to replace exons 13–16 (containing the PDZ domain) of the Shank3 gene with a
NEO cassette. Genotypes were determined by PCR of mouse tail DNA, using: for
Shank3A, primers F1a (GGTTGAGGATGAGCAAGCTAG) and R1a
(GGGACATAAGTGAAGGTTAGG) for the wild-type allele (318 base pairs), and F1a and
R2 (TCAGGGTTATTGTCTCATGAGC; in the neo cassette) for the mutant
allele (361 base pairs); for Shank3B, primers F1b
(GAGCTCTACTCCCTTAGGACTT) and R1b (TCCCCCTTTCACTGGACACCC) for the wild-type
allele (316 base pairs), and F1b and R2 (TCAGGGTTATTGTCTCATGAGC;
in the neo cassette) for the mutant allele (360 base pairs). The NEO cassette
was not removed.
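The amplicon sizes given above map directly onto a genotype call from the bands seen on a gel: a wild-type band alone, a mutant band alone, or both. The following is a minimal sketch of that lookup. The function name, the list-of-band-sizes input and the 10 bp tolerance are assumptions made for illustration and are not part of the published protocol; only the four expected product sizes come from the text.

```python
# Expected PCR product sizes in base pairs, taken from the genotyping
# description in the text.
AMPLICONS = {
    "Shank3A": {"wild_type": 318, "mutant": 361},
    "Shank3B": {"wild_type": 316, "mutant": 360},
}


def call_genotype(line, band_sizes, tolerance=10):
    """Return '+/+', '+/-', '-/-' or 'no call' from observed band sizes.

    line:       'Shank3A' or 'Shank3B'
    band_sizes: estimated sizes (bp) of bands seen on the gel (hypothetical input)
    tolerance:  allowed deviation in bp when matching a band (assumed value)
    """
    expected = AMPLICONS[line]
    has_wt = any(abs(b - expected["wild_type"]) <= tolerance for b in band_sizes)
    has_mut = any(abs(b - expected["mutant"]) <= tolerance for b in band_sizes)
    if has_wt and has_mut:
        return "+/-"
    if has_wt:
        return "+/+"
    if has_mut:
        return "-/-"
    return "no call"
```

Because the wild-type and mutant products differ by only about 43 bp, the tolerance has to stay well below half that difference for the two bands to remain distinguishable.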

Chimaeric mice were crossed to C57 females (Jackson Labs). Initially, F1
hybrids from heterozygous × heterozygous matings were generated. However,
homozygous knockout mice from this type of mating are smaller than their
wild-type littermates, presumably owing to inadequate competition for resources
during early postnatal days, leading to different developmental trajectories. We
postulated that this size difference would influence our behavioural tests. To
alleviate this confound, heterozygous animals were crossed in direct
brother–sister matings for five generations, from which we derived F5 isogenic
hybrids in a mixed background. These isogenic animals were then used to generate
time-mated homozygous × homozygous breeding pairs to obtain wild-type and
mutant animals used in the experiments. F5 Shank3A and F5 Shank3B knockouts
from these matings are reared to weaning age with weights similar to those of
F5 control animals.

Animals were housed at a constant 23 °C in a 12 h light/dark cycle (lights off at
19:00), with food and water available ad libitum. Mice were housed 3–5 by
genotype per cage, with the exception of the animals individually housed for
grooming measurements. Only age-matched male mice were used for behavioural
experiments; all other tests included age-matched males and females in
proportional contribution across groups. Unless otherwise noted, all tests were
conducted with naive cohorts of mice. All experimental procedures were
reviewed and approved by the Duke University Institutional Animal Care and
Use Committee and the MIT Committee on Animal Care.

Grooming behaviour’. Young adult male mice, 5–6 weeks old, were used for
analysis of grooming behaviour. Habituated, individually housed animals were
video-taped for 24 h under 700 lx (day, 12 h) and ~2 lx (red light at night, 12 h)
illumination. Grooming behaviours were coded from 19:00–21:00 h (that is, 2 h
beginning at the initiation of the dark cycle); this segment was analysed using 
Noldus Observer software and the total amount of time in the 2-h segment spent 
grooming was determined. Grooming included all sequences of face-wiping, 
scratching/rubbing of head and ears, and full-body grooming. The observer 
was blinded to genotype during the scoring of the videotapes. 
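The grooming measure described above is the total time spent grooming within a fixed 2-h window. The sketch below shows one way to total coded bouts within such a window. It assumes bouts are available as (start, end) times in seconds from the start of the window and that bouts do not overlap one another; this input format and the function names are hypothetical and do not represent the Noldus Observer export or API.

```python
def grooming_time(bouts, window_start=0.0, window_end=7200.0):
    """Total seconds of grooming falling inside the scoring window.

    bouts: list of (start_s, end_s) grooming bouts, assumed non-overlapping
           (hypothetical input format)
    The default window is 2 h (7200 s), matching the segment scored in the text.
    """
    total = 0.0
    for start, end in bouts:
        # Clip each bout to the window so bouts straddling an edge
        # contribute only the part that lies inside it.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            total += overlap
    return total


def percent_time_grooming(bouts, window_start=0.0, window_end=7200.0):
    """Grooming time as a percentage of the window length."""
    duration = window_end - window_start
    return 100.0 * grooming_time(bouts, window_start, window_end) / duration
```

Clipping at the window edges matters for bouts that begin before lights-off or run past the end of the scored segment.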

PSD preparation and western blot. PSD fractions of the striatum were prepared 
as previously described*’, separated on SDS-PAGE and probed with specific 
antibodies. The relative amount of β-actin was used as loading control.
Antibodies used in these experiments include rabbit antibodies against PSD-93 
(gift from D. Bredt) and Shank3 (gift from T. Boeckers). The antibody for SAPAP3 
has been previously described”. Commercial antibodies used include monoclonal 
antibodies against NR1 (Transduction Laboratories), NR2B (Millipore), CaMKII 
(Transduction Laboratories), NR2A (Millipore), and β-actin (Sigma), as well as
polyclonal antibodies against GluR1 (Abcam), Homer (Chemicon), GluR2 
(Abcam), neuroligin-3 (Synaptic Systems) and PSD-95 (Abcam). 

In situ hybridization. mRNA in situ hybridization was performed as described 
elsewhere. Briefly, reactions were performed with 20 µm cryosections from
freshly frozen 5-week-old mouse brain tissue using digoxigenin (DIG)-labelled
riboprobes against mouse Shank1 cDNA (NM_001034115; base pairs 4107- 
4924), Shank2 cDNA (NM_001081370; base pairs 2063-2876) and Shank3 
(NM_021423; base pairs 3159-3959). The complementary DNAs used were all
verified by sequencing against the following GenBank accession numbers:
Shank1, NM_001034115; Shank2, NM_001081370; Shank3, NM_021423.
The hybridization signal was detected using an alkaline phosphatase
(AP)-conjugated anti-DIG antibody (Roche) and developed using
5-bromo-4-chloro-3-indolyl phosphate/nitroblue tetrazolium (BCIP/NBT; Roche).

Motor and anxiety-like behaviours. Zero maze: an elevated zero maze was
indirectly illuminated at 100 lx. Testing commenced with an animal being intro- 
duced into a closed area of the maze. Behaviour was video-taped for 5 min and 
subsequently scored by a trained observer using Noldus Observer software. 
Anxiety-like behaviour was deduced based upon the percent time spent in the 
open areas. The observer was blinded to genotype. The animals used in the zero 
maze test (Shank3A−/−, Shank3B−/− and respective controls) were previously
tested in the open field test, with a 2-day period between tasks.

Open field: spontaneous locomotor activity was evaluated over 30 min in an 
automated Omnitech Digiscan apparatus (AccuScan Instruments) as described*'. 
Locomotor activity was assessed as total distance travelled (m). Anxiety-like 
behaviour was defined by number of rearings and time spent in the centre as 
compared to time spent in the perimeter (thigmotaxis) of the open field. 

Dark-light emergence test: mice were habituated in an adjacent room to low light
conditions (~40 lx) and the test room was initially under similar illumination. 
Testing was conducted in a two-chambered test apparatus (Med Associates), with 
one side draped in black cloth (that is, dark-chamber) and the other illuminated at 
~1,000 lx (that is, light-chamber) with a high intensity house light and overhead
fluorescent lamps. Upon placing the mice into the dark chamber, the light chamber 
was illuminated and the door between the two chambers was opened. The mice 
were allowed to freely explore the apparatus for 5 min. The latency to emerge from 
the darkened into the lighted chamber and the percentage of time spent in the 
illuminated chamber were used as indices of anxiety-like behaviours. 

Social interaction paradigm. Three-chamber social test: sociability and response 
to social novelty test was performed as previously described’” with minor modi- 
fications. Briefly, 5-6-week-old male animals were used across all tests. Target 
subjects (Stranger 1 and Stranger 2) were 5-6-week-old males habituated to being 
placed inside wire cages for 5 days before beginning of testing. Test mice were 
habituated to the testing room for at least 45 min before the start of behavioural 
tasks. The social test apparatus consisted of a transparent acrylic box with remov- 
able floor and partitions dividing the box into three chambers. Here, the middle 
chamber (20 cm × 17.5 cm) is half the width of Chamber 1 (20 cm × 35 cm) and
Chamber 2 (20 cm × 35 cm), with the overall dimensions of the box being 60 cm
(length) × 35 cm (width), with 5 cm openings between each chamber which can
be closed or opened with a lever-operated door. The wire cages used to contain the
stranger mice were cylindrical, 11 cm in height, with a bottom diameter of 10.5 cm
and bars spaced 1 cm apart (Galaxy Cup, Spectrum Diversified Designs). An
inverted transparent cup was placed on the top of the cage to prevent the test 
mice from climbing on the top of the wire cage. 

For the sociability test, the test animal was introduced to the middle chamber
and left to habituate for 5 min, after which an unfamiliar mouse (Stranger 1) was
introduced into a wire cage in one of the side chambers and an empty wire cage
was placed in the other side chamber. The dividers were then raised and the test
animal was allowed to freely explore all three chambers over a 5 min session.
Following the 5 min session, the animal remained in the chamber for an extra
5 min (post-test) to better acquire the identification cues from the Stranger 1
animal. Following this, a novel stranger mouse (Stranger 2) was inserted in the
previously empty wire cage and the test animal was again left to explore for a
5 min session. Time spent in each chamber, time spent in close proximity and heat
maps were calculated using the automated software Noldus EthoVision. The release
of the animals and the relative position of social and inanimate targets were
counterbalanced. However, for each individual test animal the location of
Stranger 1 was maintained during the Stranger 1-E (empty cage) and
Stranger 1-Stranger 2 phases of social behaviour testing.

Dyadic social interaction: animals were acclimatized to the test room for at least 
1 h before the experiment. Target mice were 6-week-old wild-type and Shank3B−/−
mice. Stimulus mice were conspecific age-matched wild-type mice
socially naive to the target mice. At least 3h before the beginning of the test, 
stimulus mice were given identifiable markings on the tails using a black marker 
pen. A pair of target and stimulus mice were introduced in a transparent Plexiglas 
arena (40 cm × 40 cm × 30 cm) covered with fresh bedding and the session
recorded for 10 min. Quantification of social behaviours was performed using 
Noldus Observer software by a researcher blinded to the genotype of the target 
animals. Quantifications included: reciprocal social interaction, as determined by 
any sequence or combination of sequences involving close huddling, sniffing (for 
example, nose-to-nose, anogenital sniffing) or allogrooming by the target and 
stimulus mouse; the frequency of nose-to-nose sniffing; and the frequency of 
anogenital sniffing initiated by the target animal towards the stimulus mouse. 
Statistical tests were performed using unpaired two-tailed t-test. 
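
As an illustration of the unpaired comparison named above, the following minimal sketch computes a Student's t statistic and its degrees of freedom for two independent groups. The equal-variance (pooled) form is an assumption, since the text does not state which variant was used, and the group values are hypothetical placeholders rather than data from this study. Converting the statistic to a two-tailed P value additionally requires the t distribution, which is left to a statistics package.

```python
import math
from statistics import mean, variance


def unpaired_t(group_a, group_b):
    """Return (t, df) for Student's unpaired t-test with pooled variance.

    Assumption: equal variances in both groups (not stated in the text).
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled sample variance weights each group's variance by its df.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical interaction times (s) for two genotypes.
wild_type = [12.0, 15.0, 11.0, 14.0]
mutant = [7.0, 9.0, 6.0, 8.0]
t_stat, df = unpaired_t(wild_type, mutant)
print(f"t = {t_stat:.3f}, df = {df}")
```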

Rotarod. Motor coordination was assessed in an accelerating rotarod test (4- 
40 r.p.m.). Briefly, animals were introduced in the apparatus (Med Associates) 
and the latency to fall was determined. Animals were tested for three trials in a 
single day with an inter-trial interval of 10 min. 

Morris water maze. Morris water maze testing was conducted as described
elsewhere with minor modifications. Male mice (4-5 weeks old) selected for the test
were individually handled daily for 5 days before beginning the experiment. 
The testing pool was 120 cm in diameter and the platform 8 cm in diameter. The
platform was submerged 1 cm below the water surface. Pool water was maintained
at 23.0 ± 0.5 °C and made opaque by mixing in white non-toxic tempera
paint. During training, 90 s trials were used; if the animals did not find
the platform within 90 s, the experimenter guided the animal to the platform.
After reaching the platform the animals were left for 15 s on top of the platform
before being removed. Trials were administered for 5 days with four trials per
animal per day with the platform located in the south-west quadrant. On the sixth
day a 60 s probe trial was performed. On the seventh day, the reversal training
commenced with the platform in the north-east quadrant, and proceeded as
described above. The experimenter followed the animals’ progress using tracking
software outside of the testing room. Tracking and analysis were performed using
the Noldus EthoVision software.

Golgi staining and Sholl analysis. All brains and collected sections were coded in
order to blind the experimenter to the genotype until after all data were collected
and analysed. Brains from 5-week-old, gender-matched Shank3B−/− and control
mice were prepared using the standard Golgi-Cox impregnation technique with the
FD Rapid GolgiStain Kit (FD NeuroTechnologies). Serial coronal sections of 100 µm
were collected from controls and Shank3B mutant animals. A total of 12 cells per
animal were traced across the dorsal striatum so as to sample representatively from
this structure, for a final number of 36 cells per genotype. For each animal, sections
were selected to be between rostral-caudal bregma 1.18 mm and 0.86 mm.
Criteria to identify medium spiny neurons (MSNs) were: (1) presence within the
caudate putamen; (2) full impregnation of the neuron along the entire length of the
dendritic arborization; (3) relative non-overlap with surrounding neurons and
isolation from astrocytes and blood vessels; and (4) morphology characteristic of
MSNs, namely a high number of spines and relatively short neuronal arborizations.
For each selected neuron the entire neuronal arbor
was reconstructed under a ×100 oil lens in a motorized microscope with a digital
CCD camera connected to a computer running Neurolucida software (MBF
Bioscience). The three-dimensional analysis of the reconstructed neurons was
performed using NeuroExplorer software (MBF Bioscience), and data on
branch length, number of branches and neuronal complexity were measured
and analysed in Prism (GraphPad). Two-way repeated measures ANOVA was
used for Sholl analysis. Statistical significance was accepted when *P < 0.05,
**P < 0.01 and ***P < 0.0001.
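
The principle of a Sholl analysis can be sketched as follows: concentric rings are centred on the soma and the number of dendritic crossings is counted at each radius. This is a simplified two-dimensional illustration, not the three-dimensional NeuroExplorer procedure used here; the segment coordinates and ring radii are hypothetical, and each straight segment is assumed to cross a given ring at most once.

```python
import math


def sholl_counts(segments, soma, radii):
    """Count dendritic segments crossing each ring centred on the soma.

    segments: list of ((x1, y1), (x2, y2)) straight pieces of dendrite.
    A segment is counted at radius r when r lies between the distances
    of its two endpoints from the soma (simplifying assumption).
    """
    counts = []
    for r in radii:
        crossings = 0
        for start, end in segments:
            d_start = math.dist(start, soma)
            d_end = math.dist(end, soma)
            if min(d_start, d_end) < r <= max(d_start, d_end):
                crossings += 1
        counts.append(crossings)
    return counts


# Hypothetical arbor: one primary dendrite that branches in two (units: µm).
segments = [((0, 0), (30, 0)), ((30, 0), (50, 20)), ((30, 0), (45, -25))]
radii = [10, 20, 40, 60]
counts = sholl_counts(segments, (0, 0), radii)
print(dict(zip(radii, counts)))
```

Plotting the counts against radius gives the intersection profile that is then compared between genotypes.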

Cortico-striatal electrophysiology. Brain slice preparation for extracellular field 
recording: acute brain slices were prepared from 6-7-week-old mice. Slices were 
prepared from one WT and one KO pair each day and the experimenter was blinded 
to the genotype. The mice were deeply anesthetized by intra-peritoneal injection of 
avertin and then transcardially perfused with carbogenated (95% O2, 5% CO2) ice-
cold protective cutting artificial cerebrospinal fluid (aCSF) with the composition (in
mM): 119 glycerol, 2.5 KCl, 1.25 NaH2PO4, 26 NaHCO3, 25 glucose, 2 thiourea, 5
L-ascorbic acid, 3 Na-pyruvate, 0.5 CaCl2·4H2O, 10 MgSO4·7H2O. Mice were then
decapitated and the brains were removed into ice-cold cutting solution for an
additional 1 min. The brains were then rapidly blocked for coronal sectioning at
300-µm thickness on a VF200 model compresstome (Precisionary Instruments)
using either a sapphire or zirconium ceramic injector-style blade. Slices containing
the dorsal striatum were initially recovered for 30 min at room temperature (23-
25 °C) in a carbogenated protective recovery aCSF (same composition as the cutting
aCSF except that glycerol was replaced with N-methyl-D-glucamine (NMDG)-Cl as
a substitute for NaCl to prevent initial excitotoxic swelling during re-warming).
After this initial 30 min period the slices were transferred into a holding chamber
containing carbogenated normal aCSF of the composition (in mM): 119 NaCl, 2.5
KCl, 1.25 NaH2PO4, 26 NaHCO3, 12.5 glucose, 2 CaCl2·4H2O, 1 MgSO4·7H2O. The
holding aCSF was supplemented with (in mM): 2 thiourea, 1 L-ascorbic acid, 3 Na-
pyruvate to improve slice health and longevity, and slices were stored for 1-6 h before
transfer to the recording chamber for use. The osmolarity of all solutions was
measured at 300-310 mOsm and the pH was maintained at ~7.3 after equilibration
under constant carbogenation.
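
For readers preparing solutions from the millimolar recipes above, the conversion to weighed mass is concentration × molar mass × volume. The sketch below applies it to four components of the normal aCSF. The molar masses are standard values for the anhydrous compounds; the helper name and the 1 litre batch size are illustrative assumptions, and the hydrated calcium and magnesium salts are omitted because their molar mass depends on the exact hydrate used.

```python
# Molar masses in g/mol (standard values, anhydrous compounds).
MOLAR_MASS = {"NaCl": 58.44, "KCl": 74.55, "NaHCO3": 84.01, "glucose": 180.16}

# Subset of the normal aCSF recipe given in the text, in mM.
NORMAL_ACSF_MM = {"NaCl": 119, "KCl": 2.5, "NaHCO3": 26, "glucose": 12.5}


def grams_needed(recipe_mm, volume_litres):
    """Convert a recipe in mM to grams of each solute for the given volume."""
    return {
        solute: (conc_mm / 1000) * MOLAR_MASS[solute] * volume_litres
        for solute, conc_mm in recipe_mm.items()
    }


# Hypothetical batch size of 1 litre.
masses = grams_needed(NORMAL_ACSF_MM, 1.0)
for solute, grams in masses.items():
    print(f"{solute}: {grams:.3f} g")
```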

Supplementary Fig. 11 shows summary data for corticostriatal field recordings 
from acute coronal brain slices of Shank3A mutant versus WT mice. The method 
of slice preparation differed significantly in these earlier experiments. Mice were 
transcardially perfused with carbogenated ice-cold protective sucrose aCSF with 
the composition (in mM): 185 sucrose, 2.5 KCl, 1.25 NaH2PO4, 26 NaHCO3, 25
glucose, 0.5 CaCl2·4H2O, 4 MgSO4·7H2O (pH 7.3, 300-310 mOsm) without
supplementation of antioxidants. Slices were immediately transferred into a holding
chamber containing carbogenated normal aCSF of the composition (in mM): 119
NaCl, 2.5 KCl, 1.25 NaH2PO4, 26 NaHCO3, 12.5 glucose, 2 CaCl2·4H2O, 1
MgSO4·7H2O (pH 7.3, 300-310 mOsm) without supplementation of antioxidants,
and slices were stored for 1-4 h before transfer to the recording chamber.
The absence of the initial 30 min recovery period in ‘protective’ aCSF in addition 
to the absence of antioxidant supplementation in the cutting aCSF and in the 
aCSF in the holding chamber results in more rapid deterioration of slice health 
and smaller evoked population spike amplitudes on average, indicating reduced 
overall slice viability compared to slices prepared with a 30 min NMDG aCSF 
recovery protocol described above. However, WT and KO brain slices were
always subjected to identical procedures on any given day of recording and the
procedures were always standardized for each discrete experimental data set so 
that these factors would not introduce any potential confounds. 

Extracellular field recording. A platinum iridium concentric bipolar stimulating 
electrode (CBAPC75, 25 µm inner pole diameter; FHC) was placed on the inner
border of the corpus callosum between the cortex and dorsolateral striatum. This 
electrode position was chosen to predominantly activate cortical axons within the 
corpus callosum which heavily converge upon striatal MSNs to form excitatory 
corticostriatal synaptic connections. Although there is ample evidence on which 
to base our assertion that stimulation of the corpus callosum predominantly 
results in activation of cortical axons*'’, we are unable to exclude the possibility 
of a relatively smaller contribution arising from activation of thalamostriatal 
axons that have distal terminals in dorsolateral striatum nearby to the stimulated 
region. Thus, although we refer to our measurements as primarily reflecting 
corticostriatal transmission, our measurements are not ‘pure’ corticostriatal res- 
ponses. Borosilicate glass recording electrodes filled with 2 M NaCl were placed in 
the dorsolateral striatum approximately 400-450 µm away from the stimulating
electrode. Corticostriatal field population spikes were evoked with 0.15 ms step 
depolarizations at 0.5 mA intensity at a frequency of 0.05-0.1 Hz. Paired pulses were 
evoked with a 100 ms inter-stimulus interval. Baseline responses were monitored to 
ensure stable population spike amplitude for a minimum of 5 min. Input-output 
functions were then determined for the negative peak 1 (NP1; presynaptic fibre 
volley) and population spike amplitude by three consecutive rounds of stimulation 
from 0-1.0 mA in 0.1 mA increments. All recordings were performed at room
temperature and acquired using pCLAMP 10 software (Axon Instruments/
Molecular Devices). Data analysis was performed blind to genotype in Clampfit
(Axon Instruments/Molecular Devices). Population spike amplitude was measured 
as the average of the early peak positivity to the peak negativity and from the peak 
negativity to the late peak positivity. This standard method takes into account the 
fact that the downward population spike is superimposed on an upward field 
excitatory postsynaptic potential (fEPSP). Paired pulse ratio (PPR) was calculated 
as the ratio of the 2nd population spike amplitude to the 1st population spike
amplitude for responses to paired pulse stimulation at 0.5 mA fixed intensity with 
a 100 ms inter-stimulus interval for the pair. 
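
The amplitude and ratio definitions above can be written out as a short calculation. This is a minimal sketch under stated assumptions: the three peak voltages per response are taken as already extracted from the trace, the function names are illustrative, and the example values (in mV) are hypothetical.

```python
def population_spike_amplitude(early_positive, peak_negative, late_positive):
    """Average of the two peak-to-peak deflections around the negative peak.

    Averaging the early-positivity-to-negativity and negativity-to-late-
    positivity distances compensates for the spike riding on an upward fEPSP.
    """
    falling = early_positive - peak_negative
    rising = late_positive - peak_negative
    return (falling + rising) / 2


def paired_pulse_ratio(first_amplitude, second_amplitude):
    """PPR: second population spike amplitude divided by the first."""
    return second_amplitude / first_amplitude


# Hypothetical peak voltages (mV) for the two responses of a paired pulse.
first = population_spike_amplitude(0.2, -1.0, 0.4)
second = population_spike_amplitude(0.2, -0.8, 0.3)
ppr = paired_pulse_ratio(first, second)
print(f"amplitudes: {first:.2f} mV, {second:.2f} mV; PPR = {ppr:.3f}")
```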

Extracellular field recordings and whole-cell mEPSC recordings in the
hippocampal CA1 region were conducted in 300 µm thick acute brain slices from 6-9-
week-old WT and Shank3B mutant mice. For measurement of hippocampal CA1
population spikes, a concentric bipolar stimulating electrode was placed in the
stratum radiatum to stimulate the Schaffer collateral pathway, and a borosilicate
glass recording electrode (~2-3 MΩ) filled with recording aCSF was placed in the
CA1 pyramidal cell layer approximately 400 µm from the stimulation site. The
recording electrode was placed at the depth in the slice that gave the largest
population spike amplitude, and a stable baseline was established for <10 min.
Input-output recordings were conducted by increasing the stimulation intensity
from 0 to 160 µA in 20 µA increments. Three successive rounds were collected
and values at each intensity represent the average of the three measurements. CA1 
population spike amplitude was quantified exactly as described previously for 
cortico-striatal population spikes. For CA1 pyramidal neuron whole-cell recordings, 
pyramidal neurons in CA1 were identified under infrared-differential interference 
contrast (IR-DIC) visualization. Cells were patched with a caesium-gluconate-based
internal solution containing (in mM): 110 caesium gluconate, 15 KCl, 4 NaCl, 5
TEA-Cl, 20 HEPES, 0.2 EGTA, 5 lidocaine N-ethyl chloride, 4 ATP magnesium salt,
and 0.3 GTP sodium salt. The pH was adjusted to 7.25 with D-gluconic acid and
osmolarity was adjusted to 290-300 mOsm with sucrose as necessary. The recording
aCSF contained 1 µM TTX, 100 µM picrotoxin, 5 µM CGP55845, and 50 µM
D-APV to isolate pure AMPAR-mediated mEPSCs. CA1 neurons were voltage-
clamped at −80 mV to amplify the smallest spontaneous miniature synaptic events
that might otherwise escape detection. Criteria for acceptance were uncompensated
stable Rs < 25 MΩ and holding current < −300 pA. mEPSCs were detected using
MiniAnalysis software (Synaptosoft) as described for striatal MSNs. All recordings 
were carried out at room temperature (23-25 °C). Slices were prepared in a 20-30 
degree off-horizontal cutting angle (optimal for CA1 region) from one WT and one 
KO pair each day and the experimenter was blind to the genotypes of the animals. 

Striatal slice preparation for whole-cell recording. Mice (5-6 weeks old) were
used for all whole-cell electrophysiology procedures by an experimentalist
blinded to genotype. Acute coronal striatal slices were prepared as follows.
Briefly, mice were anesthetized with Avertin solution (20 mg/ml, 0.5 mg/g body
weight) and perfused through the heart with a small volume (about 20 ml) of ice-
cold and oxygenated (95% O2, 5% CO2) cutting solution containing (in mM): 105
NMDG, 105 HCl, 2.5 KCl, 1.2 NaH2PO4, 26 NaHCO3, 25 glucose, 10 MgSO4, 0.5
CaCl2, 5 L-ascorbic acid, 3 sodium pyruvate, 2 thiourea (pH 7.4, with osmolarity
of 295-305 mOsm). The brains were rapidly removed and placed in ice-cold
and oxygenated cutting solution. The coronal slices (300 µm) were prepared
using a slicer (Vibratome 1000 Plus, Leica Microsystems, USA) and then transferred
to an incubation chamber (BSK4, Scientific System Design Inc., USA) at
32 °C with carbogenated cutting solution, which was gradually replaced with
aCSF over 30 min through a peristaltic pump (Rainin, RP-1) allowing precise
regulation of flow rates. The slices were then kept in aCSF that contained
(in mM): 119 NaCl, 2.3 KCl, 1.0 NaH2PO4, 26 NaHCO3, 11 glucose, 1.3 MgSO4, 2.5
CaCl2 (pH was adjusted to 7.4 with HCl, with osmolarity of 295-305 mOsm) at
room temperature for at least 30 min.

Whole-cell patch-clamp. The slice was placed in a recording chamber (RC-27L, 
Warner Instruments) and constantly perfused with oxygenated aCSF at 24 °C
(TC-324B, Warner Instruments) at a rate of 1.5-2.0 ml min−1. The striatum and
individual MSNs were visualized and identified with a microscope equipped with
IR-DIC optics (BX-51WI, Olympus) by location, shape and size (ovoid cell body
with major axis of 10 to 14 µm). Two additional measures were used to distinguish
them from similar-sized GABAergic interneurons. First, GABAergic interneurons
show smaller membrane capacitance (Cm) and membrane time constant
(τm) (at least two times less) when compared to those of MSNs. In the case of
recordings done with Cs+ internal, these membrane properties were measured
immediately after membrane rupture, before the Cs+ internal had dialysed
and taken effect. Second, AMPA receptor-mediated mEPSCs showed
much faster kinetics (including both rise time and decay time constant, τdecay) in
GABAergic interneurons. Whole-cell patch-clamp recordings were obtained
from MSNs using recording pipettes (King Precision Glass, glass type 8250)
pulled in a horizontal pipette puller (P-87, Sutter Instruments) to a resistance
of 3-4 MΩ when filled with the internal solution containing (in mM): 107
CsMeSO3, 10 CsCl, 3.7 NaCl, 5 TEA-Cl, 20 HEPES, 0.2 EGTA, 5 lidocaine
N-ethyl chloride, 4 ATP magnesium salt, and 0.3 GTP sodium salt. pH was
adjusted to 7.3 with KOH and osmolarity was adjusted to 298-300 mOsm with
15 mM K2SO4.

To record AMPA receptor-mediated miniature excitatory postsynaptic currents (mEPSCs),
the cells were held in voltage clamp at −70 mV in the presence of 50 µM APV (DL-2-
amino-5-phosphono-valeric acid), 25 µM BMR (1(S),9(R)-(−)-bicuculline
methobromide), 10 µM D-serine and 1 µM TTX (all from Tocris). The miniature events
were not recorded until 5 min after entering whole-cell patch-clamp recording mode
to allow the dialysis of Cs+ internal solution for a relatively complete block of the
potassium channels in the MSNs. The mEPSCs were detected and analysed with 
MiniAnalysis (Synaptosoft). 

For paired-pulse stimulation experiments, AMPAR mediated excitatory post- 
synaptic currents (EPSCs) were evoked by a local concentric bipolar stimulating 
electrode (CBARC75, FHC) that was placed in the inner edge of corpus callosum 
within the dorso-lateral region of the striatum. Recordings were made in the 
presence of picrotoxin (100 µM) and APV (50 µM) to block activation of
GABAA receptors and NMDA receptors. Stimulation was current-controlled
(ISO-Flex, A.M.P.I.). The stimulus intensity was set at a level that could evoke 
300-400 pA AMPAR-mediated response for all the cells measured and delivered 
with an inter-stimulus interval of 50 ms. The paired-pulse measurements were 
obtained for 15-20 consecutive traces and only those traces with stable evoked 
first current response were used for data analysis. The PPR was calculated with the 
peak current response to the second pulse divided by that of the first response. 

NMDAR- and AMPAR- mediated synaptic current ratio (NMDA/AMPA 
ratio) was recorded in the presence of picrotoxin at holding potentials of 
+40 mV and −70 mV, respectively. The NMDA/AMPA ratios were measured
according to previously described methods”. Briefly, the stimulus intensity was 
set at a level that could evoke 300-400 pA AMPAR-mediated response with a 
holding potential at −70 mV. Each evoked response was repeated 15-20 times
with an inter-stimulus interval of 20 s for all the cells measured. The time point of 
the peak current at -70 mV, considered to be fully mediated by AMPARs, was 
used to establish the time window for measuring the AMPA peak at +40 mV. The 
decay to baseline of the AMPA current at -70mV was used to select a time 
window for measurement of the NMDA current; a 10-ms measurement window 
beginning 40 ms after the stimulus artefact was used. This current amplitude at 
this point was designated as the NMDAR mediated synaptic current response. 
(I_NMDA at +40 mV / I_AMPA at −70 mV) was taken as the NMDA/AMPA ratio.
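
The ratio described above can be sketched as a calculation on two averaged current traces. Several details here are assumptions made for illustration only: the traces are plain lists of currents in pA sampled at a fixed interval, the NMDA component is taken as the mean current over the 10-ms window starting 40 ms after the stimulus, and the AMPA component as the peak absolute current at −70 mV. The function name and the synthetic traces are hypothetical.

```python
def nmda_ampa_ratio(trace_minus70, trace_plus40, dt_ms, stim_index):
    """Return the NMDA/AMPA ratio from two averaged traces (currents in pA).

    trace_minus70: response at -70 mV, used for the AMPA peak.
    trace_plus40:  response at +40 mV, used for the NMDA window.
    dt_ms:         sampling interval in ms.
    stim_index:    sample index of the stimulus artefact.
    """
    # AMPA component: peak absolute current after the stimulus at -70 mV.
    ampa = max(abs(current) for current in trace_minus70[stim_index:])
    # NMDA component: mean over a 10-ms window beginning 40 ms post-stimulus.
    start = stim_index + round(40 / dt_ms)
    stop = start + round(10 / dt_ms)
    window = trace_plus40[start:stop]
    nmda = sum(window) / len(window)
    return nmda / ampa


# Synthetic traces sampled every 1 ms, stimulus at sample 5.
minus70 = [0] * 5 + [-100, -350, -200, -50] + [0] * 51
plus40 = [0] * 5 + [200] * 40 + [140] * 10 + [100] * 5
ratio = nmda_ampa_ratio(minus70, plus40, dt_ms=1.0, stim_index=5)
print(f"NMDA/AMPA ratio = {ratio:.2f}")
```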

Data acquisition and analysis. A Multiclamp 700B amplifier (Molecular Devices
Corporation) and Digidata 1440A were used to acquire whole-cell signals. The
signals were acquired at 20 kHz and filtered at 2 kHz. The series resistance was
<20 MΩ. Values are expressed as means ± s.e.m. Data were tested for
significance using either an unpaired t-test or a two-way repeated measures ANOVA.
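
The summary statistic reported throughout (mean ± s.e.m.) is the sample mean together with the sample standard deviation divided by the square root of the number of observations. A minimal sketch follows; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of measurements."""
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical mEPSC amplitudes (pA) from four cells.
m, sem = mean_sem([2.0, 4.0, 4.0, 6.0])
print(f"{m:.2f} ± {sem:.2f}")
```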

Cell filling. Mice were assigned a code before dissection so as to maintain
blinding to genotype across all procedures, including dissection, cell filling, imaging
and quantification. Mice were deeply anesthetized with an overdose of isoflurane
and transcardially perfused with PBS (pH 7.4) followed by ice-cold 4%
paraformaldehyde/PBS (PFA) (pH 7.4). The brains were removed and post-fixed
overnight in 4% PFA. After post-fixation, the brain was sliced into 200-µm-thick
coronal sections in a vibratome and kept in PBS at 4 °C. For cell filling injections,
selected brain slices immersed in PBS were mounted in a tissue stage. Dorsal 
striatal medium cells were targeted with post hoc confirmation of being medium 
spiny neurons (morphology and spine density). Using a micromanipulator, micro- 
pipettes loaded with Lucifer Yellow dye (Sigma L-0259, 8% solution in 0.05 M Tris 
buffer, pH 7.4) were used to impale the cell body. A micropipette containing a 
solution of 0.1 M LiCl was used to deliver the dye with a continuous 10 nA current 
for 5 min. Following cell filling, a post-staining was used to amplify the fluorescent 
signal. Briefly, sections were transferred to blocking solution (5% sucrose, 2% BSA, 
and 1% Triton X-100 in PBS) containing 1:500 rabbit anti-Lucifer Yellow antibody 
(Invitrogen A5750) and incubated gently for 3 days at 4°C. Sections were washed 
three times for 5 min in blocking solution and incubated 2 h at room temperature 
with 1:400 biotinylated goat anti-rabbit antibody (Vector Laboratories BA-1000). 
Next, sections were washed three times for 5 min in PBS. A tertiary incubation was 
performed by incubating sections for 2h at room temperature in streptavidin- 
conjugated Alexa 488 (Invitrogen S11223) diluted 1:1,000 in PBS. Finally,
sections were washed three times in PBS, mounted on slides using Fluoro-Gel 
(EMS, 17985-10) and imaged by confocal microscopy. Spine density was calcu- 
lated automatically using NeuronStudio (Mount Sinai School of Medicine) and 
manually curated by an observer using a three-dimensional analysis of the
dendritic image stack. All spine counts began 30 µm away from the outer edge of the
soma and extended for an additional 10-60 µm away from the starting point. The
data from spine density passed the Lilliefors normality test and D'Agostino & 
Pearson omnibus normality test. Spine metrics relating to spine length, spine neck 
diameter and spine neck width were collected using ImageJ (NIH). All analyses of 
spine metrics were performed by observers that were blinded to the genotypes of 
the animals. 

Electron microscopy. Mice were assigned a code before dissection so as to
maintain blinding to genotype across all procedures, including dissection, sample
processing, imaging and quantification. Mice were deeply anesthetized with an
overdose of isoflurane and transcardially perfused with PBS (pH 7.4) followed
by ice-cold 4% paraformaldehyde (PFA) in phosphate buffer (pH 7.4). The brains 
were removed, the striatum dissected and post-fixed overnight in PFA 4%, then 
transferred into a 4% glutaraldehyde solution and kept at 4°C for 3 days. The 
samples were washed twice, 20 min each, in 7.5% sucrose, 0.1 M sodium cacody- 
late buffer, then post-fixed in 1% osmium tetroxide for 2 h with initial microwave 
treatment for 6 min. Next, the samples were washed twice in 0.11 M veronal 
acetate buffer for 20 min each. Following en bloc staining in 1% uranyl acetate
in distilled water for 1 h, the samples were washed twice in 0.11 M veronal acetate
buffer for 20 min each. Samples were dehydrated using serial dilutions of ethanol
(70%, 95%, 2 × 100%) for 20 min each, with initial microwave treatment of 2 min.
Samples were then treated for 20 min twice with propylene oxide and impregnated
with 50:50 propylene oxide:Epon resin overnight at 4 °C, with initial microwave
treatment for 3 min. Next, the samples were impregnated with 100% Epon resin, 
three changes of 2 h each, with initial microwave treatment for 3 min each. Tissue 
samples were embedded in moulds and incubated for 48 h at 60 °C. Afterwards, 
semi-thin sections (0.5 µm) were cut on a Leica UltraCut S ultramicrotome and
stained with Toluidine (0.8%) stain. From these, thin striatal sections (70 nm)
were cut on an UltraCut S, mounted on 200 mesh Metaxaform copper-rhodium
grids and post-stained in 2% uranyl acetate in distilled water for 15 min and Sato’s
lead citrate stain for 7 min. Grids were examined on a Philips (FEI) CM 12
transmission electron microscope. Images were acquired at ×40,000 magnification
using an AMT 2Vue system, with an ORCA HR high-resolution 7-megapixel digital
camera, a Hamamatsu DCAM board for acquisition and AMT
Image Capture Engine software version 600.335f. Images were saved as
7.5-megapixel 8-bit TIFF files. PSD measurements were performed using ImageJ
(NIH) by an observer who was blinded to the genotype of the samples.

Magnetic resonance image acquisition. Animals were assigned a blinding code,
which was maintained during magnetic resonance (MR) data acquisition and
analysis. MR mouse brain imaging was performed on a 7T Bruker Biospec 70/30
horizontal bore system (Billerica). Animals were lightly anesthetized under
isoflurane with continuous monitoring and maintenance of physiological
parameters throughout the imaging session (~60 min for each animal). Axial
two-dimensional T2-weighted fast spin echo images (TURBO-RARE, TE/TR = 11/4,200 ms
with 1 mm slice thickness, matrix = 256 × 256 and FOV of 2.4 cm × 2.4 cm,
five averages, 0.0 mm interslice gap) were first obtained for screening
purposes and supplemental anatomic information. For directed striatal and brain
volumetric analysis, 64 contiguous 500-µm thick three-dimensional FSE proton
density images (TURBO-RARE, TE/TR = 9/1,500 ms, matrix = 256 × 256 × 64
and FOV 2.2 cm × 2.2 cm × 2.2 cm, 25 min duration) were acquired.

MR volumetry measurements. Volumetric analysis of MR data sets was per- 
formed in OsiriX software, an open source image processing application 
developed and maintained by Pixmeo. The left caudate-putamen was manually
segmented in each animal by an investigator blinded to genotype. Each caudate- 
putamen was traced on contiguous axial slices from the three-dimensional 
volume acquisition with reference to a high-resolution age-matched mouse brain 
atlas (The Mouse Brain Atlas, The Mouse Brain Library at http://www.mbl.org).
Selected areas were reviewed for consistency on coronal and sagittal representa- 
tions, and cross-correlated with axial two-dimensional FSE images. Volumes 
were computed within OsiriX. Two separate striatal segmentations were obtained 
for each animal, with the average volume then taken. Intrarater reliability (kappa 
value) was = 0.97. 

Volume normalization. Unilateral caudate-putamen volumes were normalized 
to brain volume measurements obtained from the same three-dimensional 
volume sets. Because of susceptibility distortions within the posterior fossa, 
and variable inclusion from animal to animal of posterior fossa contents at the 
caudal end of the 64 slice three-dimensional volume set, ‘whole’ brain volumes for 
normalization were obtained in each animal from traces beginning rostrally at the
olfactory bulbs and ending caudally through the cerebral aqueduct at the roof of
the fourth ventricle. As with striatal volumes, brain volumes were computed in 
OsiriX from the average of two segmentations. Intrarater reliability (kappa value) 
was >0.99. Statistical analysis was performed with unpaired two-tailed t-test. 
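
The normalization and group comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the example volumes are hypothetical, and the pooled-variance (Student) form of the unpaired t statistic is an assumption, since the text does not say which variant was used. A two-tailed P value would additionally require the t distribution (for example from a statistics package).

```python
from math import sqrt
from statistics import mean, variance


def normalized_volume(striatal_traces, brain_traces):
    """Average the repeated segmentations, then divide striatal by brain volume."""
    return mean(striatal_traces) / mean(brain_traces)


def unpaired_t_statistic(group_a, group_b):
    """Return Student's t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical volumes (mm^3) from two segmentations of one animal.
ratio = normalized_volume([10.0, 10.4], [400.0, 408.0])

# Hypothetical normalized volumes for two genotype groups.
t_value, dof = unpaired_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Averaging before dividing mirrors the order given in the text, where each volume is the mean of two segmentations and the ratio is taken afterwards.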

Loss-of-function mutations in sodium
channel Nav1.7 cause anosmia


Jan Weiss, Martina Pyrski, Eric Jacobi, Bernd Bufe, Vivienne Willnecker, Bernhard Schick, Philippe Zizzari,
Samuel J. Gossage, Charles A. Greer, Trese Leinders-Zufall, C. Geoffrey Woods, John N. Wood & Frank Zufall


Loss of function of the gene SCN9A, encoding the voltage-gated sodium channel Nav1.7, causes a congenital inability to
experience pain in humans. Here we show that Nav1.7 is not only necessary for pain sensation but is also an essential
requirement for odour perception in both mice and humans. We examined human patients with loss-of-function
mutations in SCN9A and show that they are unable to sense odours. To establish the essential role of Nav1.7 in odour
perception, we generated conditional null mice in which Nav1.7 was removed from all olfactory sensory neurons. In the
absence of Nav1.7, these neurons still produce odour-evoked action potentials but fail to initiate synaptic signalling from
their axon terminals at the first synapse in the olfactory system. The mutant mice no longer display vital, odour-guided
behaviours such as innate odour recognition and avoidance, short-term odour learning, and maternal pup retrieval. Our
study creates a mouse model of congenital general anosmia and provides new strategies to explore the genetic basis of the
human sense of smell.


The inability to sense odours is known as general anosmia; individuals
born with this phenotype are afflicted with congenital general anosmia.
Except for some syndromic cases such as Kallmann syndrome, no
causative genes for human congenital general anosmia have been identified
so far. Nine mammalian genes encoding voltage-gated sodium
channel α-subunits have been cloned and shown to be differentially
expressed in the nervous system. Of these, SCN9A, encoding the
tetrodotoxin (TTX)-sensitive sodium channel Nav1.7, has received
specific attention because of its key role in human pain perception.
Individuals carrying loss-of-function mutations in SCN9A are unable
to experience pain, and an essential requirement of Nav1.7 function for
nociception in humans has been established. Whether all other
sensory modalities are fully preserved in these individuals remained
unclear, although an association between congenital inability to experience
pain and sense of smell deficits has been suggested. In this study
we examine human patients carrying SCN9A loss-of-function mutations
and demonstrate that they fail to sense odours. We establish a
mouse model of congenital general anosmia and provide mechanistic
insight into the role of Nav1.7 in olfaction. Together with previous
findings, our results establish that loss-of-function mutations in
a single gene, SCN9A, cause a general loss of two major senses,
nociception and smell, thus providing a mechanistic link between
these two sensory modalities.


Requirement for Nav1.7 in human olfaction


Three individuals with congenital analgesia were ascertained and
studied. All three were in their third decade of life and had never
experienced acute pain but had no other neurological, cognitive,
growth, appearance or health problems. All had suffered from multiple
painless fractures and other injuries. Two had given birth painlessly. A
working diagnosis of channelopathy-associated insensitivity to pain
(CAIP) was made and in each SCN9A was sequenced. In the first,
who has been the subject of a detailed case report, the mutations
c.774_775delGT and c.2488C>T were found. These mutations,
frameshift and nonsense, respectively, would be predicted to lead to a
lack of functional Nav1.7 protein. The other two were siblings and had
the mutations c.4975A>T and c.3703delATAGCATATGG; again,
nonsense and frameshift mutations and predicted to lead to no functional
Nav1.7 protein. The mother of the siblings was found to be
heterozygous for the 11-base-pair deletion and the father heterozygous
for the nonsense mutation. Therefore the diagnosis of CAIP was substantiated.
We next assessed their sense of smell; none complained of
having no sense of smell, one had been a cigarette smoker and none had
chronic nasal problems. In the first woman smell function was assessed
by using the University of Pennsylvania Smell Identification Test
(UPSIT), a standardized 40-item smell test. The results revealed that
she was unable to detect any of the odours (Fig. 1a, black bar). Nine
healthy, young individuals served as controls (Fig. 1a, grey bars). In the
sibling pair we assessed the parents and their two affected offspring
together. All were tested in sequence with cotton wool pads suffused
with selected odour stimuli: balsamic vinegar, orange, mint, perfume,
water (control) and coffee. Both parents correctly identified all stimuli,
including smelling nothing for the water. The siblings detected none of
the odours. For the siblings the test was repeated using subjectively
unpleasant amounts of balsamic vinegar and perfume: the parents
identified the odours correctly and found them unpleasant; the siblings
neither identified the odours nor experienced any discomfort.

We proposed that these odour-sensing deficits are caused by loss of
Nav1.7 function in olfactory sensory neurons (OSNs). Indeed, when
we investigated expression of Nav1.7 in normal human olfactory
epithelium, we detected messenger RNA for Nav1.7 and the GTP-binding
protein Gαolf, a prototypical signature of classical OSNs
(Fig. 1b). Immunohistochemistry using an antibody specific to Nav1.7
verified that Nav1.7 is normally expressed in human OSNs (Fig. 1c, d).

Figure 1 | Nav1.7 in human olfaction. a, Quantified olfactory assessment of
the first individual with confirmed Nav1.7 loss-of-function mutations (black
bar) using the standardized, 40-item UPSIT test showed that she was unable to
detect any of the odour stimuli; the test score revealed general anosmia in this
patient. Nine healthy, young individuals served as controls (grey bars). We
assessed odour perception in two other individuals with confirmed Nav1.7
loss-of-function mutations and both were unable to sense any of the odours. These
results are described in the main text. b, Expression of Nav1.7 in olfactory
epithelium from unaffected normal humans. RT-PCR products with gene-specific
primers for human Nav1.7 (top; size, 1,128 bp) and the G-protein Gαolf
(bottom; size, 143 bp). PCRs were carried out with equal amounts of RNA in the
presence (+RT) or absence (−RT) of reverse transcriptase to exclude product
amplification from genomic DNA. M1, size marker; 2,000 bp, 850 bp; M2, size
marker; 400 bp, 100 bp. Similar results were obtained in two other human
olfactory mucosa samples. c, Confocal fluorescence image of Nav1.7
immunoreactivity (red) in a cryosection of human olfactory epithelium. Scale
bar, 20 µm. d, Enlargement showing a single OSN. Scale bar, 5 µm.

Figure 2 | Nav1.7 expression in the mouse main olfactory system. a, Nav1.7
immunoreactivity (red) detected in the olfactory bulbs (OB, top) and the
underlying main olfactory epithelium (MOE, bottom). B6 mouse, 2-days old.
b, Colocalization of Nav1.7 (red) and OMP (green) expression in the ONL and
glomerular layer (GL) of the olfactory bulb. B6 mouse, 3-weeks old. c, Strong
Nav1.7 immunoreactivity in OSN axon bundles (arrowheads). B6 mouse,


Conditional Nav1.7 null mice


To investigate the mechanisms that underlie the essential role of Nav1.7
in odour perception, we first examined Nav1.7 expression in the mouse
olfactory system and then used the Cre-loxP system to delete the channel
in those cells that express olfactory marker protein (OMP), which
includes all classical OSNs. These mice enabled us to examine the
mechanisms underlying Nav1.7-associated anosmia and the behavioural
consequences. Consistent with our findings in human OSNs, OSNs
from wild-type mice (C57BL/6, referred to as B6) showed Nav1.7
immunoreactivity at their somata (Supplementary Fig. 1). Of greater
interest, coronal sections containing main olfactory epithelium (MOE),
olfactory nerves and the two olfactory bulbs revealed the most marked
Nav1.7 staining in the superficial olfactory nerve layer (ONL, containing
axons from OSNs) as well as the glomerular layer (a complex neuropil
that includes the presynaptic OSN boutons) of the olfactory bulb
(Fig. 2a–c). Higher magnification of individual glomeruli verified
co-expression of Nav1.7 with OMP in the glomerular neuropil (Fig. 2b),
whereas olfactory bulb projection neurons (the mitral/tufted or M/T
cells) and local interneurons did not show Nav1.7 immunoreactivity
(Fig. 2a). Thus, Nav1.7 occupies a critical presynaptic location at the
first synapse in the olfactory system.

Nav1.7 is not the sole Nav channel expressed in mouse OSNs. Real-time
quantitative polymerase chain reaction with reverse transcription
(qRT-PCR) analysis identified Nav1.3 as an additional candidate
(Supplementary Fig. 2) and immunohistochemistry verified its
expression in OSNs and their axons (Fig. 2d). However, unlike
Nav1.7 we did not observe Nav1.3 immunoreactivity in individual
glomeruli (Fig. 2d), indicating that Nav1.7 could be the sole Nav channel
underlying action potential propagation in olfactory glomeruli and
OSN nerve terminals.

To create a conditional knockout mouse model, we crossed ‘floxed’
Nav1.7 mice harbouring a loxP-flanked Scn9a gene to homozygous
OMP-Cre mice in which the OMP-coding region is replaced by that of
Cre recombinase. Further breeding established offspring that were
both homozygous for the floxed Scn9a alleles and heterozygous for cre
and Omp. In these mice, Cre-mediated Nav1.7 deletion was restricted
to OMP-positive cells (henceforward referred to as cNav1.7−/− mice).
These mice lacked Nav1.7 expression in a tissue-specific manner


Figure 2 (continued) | c, B6 mouse, 3-weeks old. d, Nav1.3 expression (red)
terminates in the ONL and is not detectable in OMP-labelled (green) glomeruli.
OMP-GFP mouse, 16-days old. e, f, cNav1.7−/− mice lack Nav1.7
immunoreactivity in the ONL and glomerular layer of the olfactory bulb (e) as
well as in OSN axon bundles of MOE (f). Dashed circle, individual glomerulus.
Scale bars: a, 200 µm; b, c, e, f, 100 µm; d, 50 µm.

(Fig. 2e, f and Supplementary Fig. 3). Successful matings occurred
between cNav1.7+/− males and cNav1.7−/− females whereas homozygous
knockout pairs did not produce any offspring. cNav1.7−/− mice
showed a reduced body weight during the first three months of
postnatal development (Supplementary Fig. 4a). Because both
Nav1.7 (refs 14, 15) and OMP are also expressed in some neurons
mediating hormonal regulation, we assayed insulin-like growth factor
(IGF-1, also known as somatomedin C) in cNav1.7−/− (650 ± 94 ng
ml−1; n = 4) versus cNav1.7+/− mice (684 ± 27 ng ml−1; n = 4;
mean ± s.d.) but found no significant difference between the two genotypes
(P = 0.26). Given that newborn cNav1.7−/− mice had very little
milk in their stomachs (Supplementary Fig. 4b), the diminished weight
gain was probably caused by a failure to suckle effectively, consistent
with results in mice deficient in Gαolf (Gnal) or the cAMP-gated cation
channel (Cnga2).


Loss of synaptic transfer in olfactory glomeruli 


To define the function of Nav1.7 in OSNs, we prepared MOE tissue
slices and recorded sodium currents in voltage-clamped OSNs. Both
Nav1.7+/− and Nav1.7−/− OSNs displayed sizeable, TTX-sensitive
sodium currents in response to step depolarizations (Fig. 3a, b). On
the basis of its biophysical properties, Nav1.7 has been suggested to
transduce generator potentials into action potentials in sensory neurons.
However, peak current densities of voltage-activated sodium
currents were reduced only moderately, by about 20%, in Nav1.7−/−
OSNs (Fig. 3b). To determine whether Nav1.7−/− OSNs could still
produce odour-evoked action potentials, we used extracellular loose-patch
recording from visually identified OSN dendritic knobs and
analysed spike frequency histograms after brief odour exposure
(Fig. 3c). There was no obvious difference in odour responsiveness
in Nav1.7−/− versus Nav1.7+/− OSNs (Fig. 3c). We obtained similar
results when we stimulated the cells with 3-isobutyl-1-methylxanthine
(IBMX), which raises intracellular cAMP by inhibiting endogenous
phosphodiesterase activity (Fig. 3c). Thus, although the initial site of
odour-evoked action potential generation in OSNs is unknown, Nav1.7
is not essential for this activity.
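
A spike frequency histogram of the kind mentioned above can be built by binning spike times and dividing each count by the bin width. The sketch below is only illustrative: the function name, the bin width and the spike times are hypothetical, and the text does not state the binning the authors used.

```python
def spike_frequency_histogram(spike_times_s, t_start_s, t_stop_s, bin_width_s):
    """Return firing rate (spikes per second) in consecutive bins of equal width."""
    n_bins = int(round((t_stop_s - t_start_s) / bin_width_s))
    counts = [0] * n_bins
    for t in spike_times_s:
        if t_start_s <= t < t_stop_s:
            # Clamp the index so rounding at the upper edge cannot overflow.
            index = min(int((t - t_start_s) / bin_width_s), n_bins - 1)
            counts[index] += 1
    return [count / bin_width_s for count in counts]


# Hypothetical spike times (s) relative to stimulus onset at t = 0.
rates_hz = spike_frequency_histogram([0.1, 0.6, 0.7, 1.2], 0.0, 1.5, 0.5)
```

Comparing such histograms between genotypes is one way to express the statement that firing properties were similar.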

Because Nav1.7 is expressed in olfactory bulb glomeruli (Fig. 2), we
reasoned that it could be required for action potential conduction in
OSN terminals. Olfactory glomeruli are delineated spheres of neuropil
containing synapses from the OSN axon terminals onto juxtaglomerular
interneurons and M/T projection neurons. To examine
whether presynaptic activity of Nav1.7 underlies transmitter release in
the olfactory glomerulus, we prepared olfactory bulb tissue slices
and combined ONL focal electric stimulation with whole-cell
patch-clamp recording from visually identified M/T cells. With the
chosen protocol, in control cNav1.7+/− mice a single electrical stimulus
in the ONL produced a reliable postsynaptic response in M/T cells.
Under current clamp, such responses consisted of a prolonged excitation
lasting on average for 2.4 ± 0.4 s (Fig. 3d, top; n = 29), with
response latencies of 22 ± 4 ms (n = 29). Under voltage clamp, we
observed bursts of postsynaptic currents (Fig. 3f; duration, 3.2 ± 0.4 s;
n = 26). In stark contrast, in the cNav1.7−/− mice such postsynaptic
responses were virtually absent in M/T cells, even when the stimulus
strength was increased by several-fold (Fig. 3d–f, n = 49). Importantly,
M/T cells in these mice still produced normal action potentials when
depolarized via current injection through the patch pipette (Fig. 3d,
bottom), consistent with the fact that M/T cells lack both OMP and
Nav1.7 expression (Fig. 2) and indicating that the effect of deleting
Nav1.7 is presynaptic to the M/T cells. The inability of M/T cells to
produce synaptic responses to ONL stimulation was not due to a potential
deficit in synapse formation because: (1) immunohistochemistry
showed normal expression of the vesicular glutamate transporter 2
(vGluT2, which is selectively expressed in OSN axon terminals)
(Supplementary Fig. 5); and (2) electron microscopy revealed the existence
of normal OSN boutons and synapses in the glomeruli of
cNav1.7−/− mice (Supplementary Fig. 6). Furthermore, conditional

Figure 3 | Nav1.7 is essential for synaptic transfer in the olfactory
glomerulus. a, Families of whole-cell currents to a series of depolarizing voltage
steps recorded from cNav1.7−/− OSNs. OSNs were exposed successively to
extracellular bath solution (control), bath solution containing Cd2+ (100 µM),
and bath solution containing Cd2+ (100 µM) and TTX (2 µM). Holding
potential, −70 mV. b, Current-density–voltage curves of sodium currents from
cNav1.7+/− (black; n = 35) and cNav1.7−/− OSNs (red; n = 46). Current
densities of Nav1.7−/− OSNs were significantly diminished between −30 mV
and 10 mV (LSD, P = 0.001–0.04). c, Action potential responses in visually
identified OSN dendritic knobs to pulsed stimulation (0.2 s) with cineole
(100 µM) or IBMX (100 µM) in cNav1.7+/− (black; n = 28 and 78 cells,
respectively) and cNav1.7−/− OSNs (black; n = 5 and 25, respectively). Firing
properties were similar in both genotypes (LSD, P = 0.14–0.73). d, M/T cells
exhibit postsynaptic potentials to presynaptic nerve stimulation (NS) in
cNav1.7+/− (black), but not in cNav1.7−/− mice (red), whereas M/T cells in
both genotypes show normal action potentials to current injection (200 pA).
Current-clamp whole-cell recording. e, Absence of M/T cell postsynaptic
potentials to nerve stimulation in cNav1.7−/− (red; n = 44) versus cNav1.7+/−
(black; n = 29) (LSD, P < 0.01–0.001). f, Example of postsynaptic currents in
M/T cells of cNav1.7+/− (black) and cNav1.7−/− mice (red). g, Analysis of area
under curve (AUC) of M/T cell postsynaptic currents during a 5 s interval after
nerve stimulation in cNav1.7+/− (black) and cNav1.7−/− mice (red). Number of
cells tested is shown in brackets above each bar. Unpaired t-test: ***P < 0.0001.
h, Marked reduction of TH expression in juxtaglomerular cells of the olfactory
bulb in cNav1.7−/− mice. EPL, external plexiform layer; ML, mitral cell layer.
Arrows in the inset (+/+) indicate individual juxtaglomerular cells. Scale bars:
overview, 100 µm; inset, 50 µm. Error bars represent mean ± s.e.m.
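
The area-under-curve measure in panel g can be approximated by trapezoidal integration of the current trace over the 5 s after nerve stimulation. The following is a sketch under stated assumptions: the function name, the sample trace and the use of the current magnitude (so that inward, negative currents give a positive area) are illustrative choices, not details given in the caption.

```python
def area_under_curve(times_s, currents_pa, t_stim_s, window_s=5.0):
    """Trapezoidal integral of |current| from t_stim_s to t_stim_s + window_s (pA*s)."""
    t_end_s = t_stim_s + window_s
    total = 0.0
    for i in range(1, len(times_s)):
        t0, t1 = times_s[i - 1], times_s[i]
        # Only whole sample intervals inside the analysis window are counted.
        if t0 >= t_stim_s and t1 <= t_end_s:
            height = 0.5 * (abs(currents_pa[i - 1]) + abs(currents_pa[i]))
            total += height * (t1 - t0)
    return total


# Hypothetical trace sampled once per second; stimulation at t = 0 s.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
currents = [0.0, -10.0, -10.0, 0.0, 0.0, 0.0, -100.0]
auc = area_under_curve(times, currents, t_stim_s=0.0)
```

The large deflection at 6 s lies outside the window and does not contribute, which illustrates why the interval length matters for this measure.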

OSN expression of tetanus toxin light chain, which inhibits synaptic
release, does not alter the pattern of axonal targeting in olfactory bulb
glomeruli during development.

Tyrosine hydroxylase (TH) expression in juxtaglomerular neurons
of the olfactory bulb, a correlate of afferent trans-synaptic activity,
requires olfactory nerve input and odour-stimulated glutamate
release by OSN terminals. Consistent with a loss of OSN synaptic
release, TH expression was markedly reduced in cNav1.7−/− mice
(Fig. 3h; n = 6). The level of TH downregulation was similar to that
observed after odour deprivation by naris occlusion or after deletion
of the Cnga2 cation channel gene. Thus, we conclude that the presence
of Nav1.7 in OSN axons is an essential and non-redundant
requirement to initiate information transfer from OSN terminals to
neurons in the olfactory bulb.


The absence of odour-guided behaviours 


To further validate these results, we investigated several odour-guided
behaviours in B6, cNav1.7+/− and cNav1.7−/− mice. First, we performed
an odour preference test to assess recognition abilities for
innate odour qualities (Fig. 4a). Filter papers scented with various
cues representing both species-specific and food odours (male and
female urine, peanut butter, milk) were presented to the mice and
investigation times were analysed. Water was used as a neutral stimulus
and 1,8-cineole (eucalyptol), which does not evoke innate attraction,
served as the control (n = 7 for each cue and strain, respectively).
B6 and cNav1.7+/− mice both showed strong attraction towards conspecific
and food odours, whereas cNav1.7−/− mice failed to show any
interest in these stimuli.

Second, we explored whether Nav1.7 is required for innate avoidance
behaviour towards a predator odour, trimethyl-thiazoline
(TMT), which is normally secreted from the fox anal gland and
known to induce aversive behaviour and fear responses in mice. We
observed robust avoidance behaviour in both B6 (n = 6) and
cNav1.7+/− mice (n = 5) but, notably, cNav1.7−/− mice lacked an
innately aversive response in this assay (n = 5; Fig. 4b, c).

Third, we investigated the performance of cNav1.7−/− mice in a
habituation–dishabituation assay, which allows for measurement of
novel odour investigation, short-term odour learning, and odour
discrimination (Fig. 4d). Mice of both sexes were each presented three
distinct stimuli (water, female urine, male urine), each delivered for
three successive trials, and investigation time during each trial (3 min)
was analysed. Consistent with the results of Fig. 4a, cNav1.7−/− mice
(n = 8) failed to show significant odour investigation, habituation, or
discrimination abilities when compared with B6 (n = 8) or cNav1.7+/−
mice (n = 8) (Fig. 4d; least significant difference (LSD), P < 0.0001).
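
The structure of the habituation–dishabituation assay (three stimuli, three successive trials each) lends itself to two simple summary numbers per animal. The sketch below is an assumption for illustration only: the indices, the function name and the example investigation times are hypothetical and are not the measures reported in the paper, which compared investigation times directly.

```python
STIMULI = ["water", "female urine", "male urine"]
TRIALS_PER_STIMULUS = 3


def summarize_habituation(investigation_times_s):
    """Split nine trial times into stimulus blocks and derive two simple indices.

    habituation[stimulus]    = first trial minus last trial of the same stimulus
    dishabituation[stimulus] = first trial of a new stimulus minus the last
                               trial of the preceding stimulus
    """
    blocks = [
        investigation_times_s[i:i + TRIALS_PER_STIMULUS]
        for i in range(0, len(STIMULI) * TRIALS_PER_STIMULUS, TRIALS_PER_STIMULUS)
    ]
    habituation = {name: block[0] - block[-1] for name, block in zip(STIMULI, blocks)}
    dishabituation = {
        STIMULI[i]: blocks[i][0] - blocks[i - 1][-1] for i in range(1, len(STIMULI))
    }
    return habituation, dishabituation


# Hypothetical investigation times (s) in presentation order 1 to 9.
times_s = [6, 4, 2, 30, 15, 8, 28, 14, 7]
habituation, dishabituation = summarize_habituation(times_s)
```

An animal that cannot smell would be expected to give values near zero for both indices, which matches the qualitative description of the mutant mice.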

Last, we examined pup retrieval ability of female mice, a social
behaviour that probably depends on a functional main olfactory system
(Fig. 4e). Three pups of a litter were removed from the nest,
randomly distributed in the cage, and the time to retrieve each pup
into the nest was quantified. In contrast to the performance of B6
(n = 12) or cNav1.7+/− mice (n = 6), cNav1.7−/− mice (n = 5) failed
to retrieve any of the three pups during a 10-min trial period (Fig. 4e).


Conclusions and prospects 


Our results establish a critical role of the Nav1.7 sodium channel in
olfaction. Using conditional Nav1.7 null mice, we demonstrate that, in
the absence of Nav1.7, OSNs are still electrically active and generate
odour-evoked action potentials but fail to initiate synaptic signalling
to the projection neurons in the olfactory bulb. These results provide
evidence that Nav1.7 is an essential and non-redundant requirement
for action potential propagation in the sections of OSN axons within
the olfactory glomerulus. The conditional null mice no longer show a
wide range of vital, odour-guided behaviours including innate attraction
to food and conspecific odours, odour discrimination and short-term
odour learning, innate avoidance towards a predator odour,
effective suckling behaviour of newborn pups, and maternal pup
retrieval. Within the limits of our anatomical analyses, synapse formation
in these mice appears normal, indicating that the behavioural
phenotype of the mutant mice is most likely the result of a loss of
signalling at the first synapse in the olfactory system. Whether Nav1.7
or other sodium channel subunits such as Nav1.3 are involved in OSN

Figure 4 | cNav1.7−/− mice are anosmic. a, Innate olfactory preference test.
Mean investigation times were quantified in B6, cNav1.7+/− and cNav1.7−/−
mice during a 3-min test period. Mean investigation time for water in B6 and
cNav1.7+/− mice (dashed line) served as a marker for attraction. Number of
animals tested is shown in brackets above each bar. LSD: *P < 0.02;
**P < 0.001; ***P < 0.0001. No difference was observed between B6 and
cNav1.7+/− mice (LSD, P = 0.1). b, c, Innate olfactory avoidance to the
predator odour TMT. b, Examples of the trajectory plot of the position of a
cNav1.7+/− and a cNav1.7−/− mouse (30-min video-tracking). Location of
TMT (5 µl) indicated by the circle in the upper right corner. c, Quantification of
time spent in either area 1 (−) or area 2 (+) during exposure to TMT indicates
that cNav1.7−/− mice lack avoidance behaviour to TMT (LSD, P = 0.67). LSD:
***P < 0.0001. d, Olfactory habituation–dishabituation assay. Mean
investigation time of B6, cNav1.7+/− and cNav1.7−/− mice (n = 8 for each
group) to three distinct stimuli (water, female and male mouse urine) during a
3-min test period were quantified. Numbers indicate stimulus presentation
order. e, Pup retrieval test. Mean latencies of B6, cNav1.7+/− and cNav1.7−/−
female mice in retrieving three individual pups that were randomly distributed
throughout the cage. cNav1.7−/− mice failed to retrieve any of the pups during a
10-min trial period (LSD, P < 0.0001). NS, not significant (LSD, P = 0.41).
Error bars represent mean ± s.e.m.


axon pathfinding and activity-dependent neural map formation in
the mouse olfactory system remains to be seen. Importantly, the
phenotype of the mutant mice (the inability to perceive odours) is
similar to that observed in human patients with confirmed Nav1.7
loss-of-function mutations. Smell tests in three individuals with congenital
analgesia establish that they are unable to sense any of the
odours. Systematic olfactory testing of patients carrying Nav1.7
loss-of-function mutations will be required in the future.

The genetic basis of sensory deficits such as blindness, deafness and
pain disorders has been extensively studied in recent years. By comparison,
relatively little progress has been made in understanding
human congenital general anosmia. Mutations in olfactory signal
transduction genes such as CNGA2, GNAL and ADCY3 do not seem
to be a major cause of human congenital general anosmia. The identification
of a sodium channel subunit as a causative gene for an
inherited form of general anosmia provides new insight into the
molecular pathophysiology of olfaction and should stimulate further
research aimed at understanding the genetic basis of the human sense
of smell.


METHODS SUMMARY 


Human subjects. All research involving humans was performed with the informed
consent of the patients and under protocols approved by the Ethics
Committee of the relevant institution. Human nasal mucosa was obtained by
biopsy during routine nasal surgery. Further details of the human smell tests can
be found in Methods.

Animals. The relevant Institutional Animal Care and Use Committee approved
all procedures. Experiments used tissue-specific Nav1.7-deficient, C57BL/6J and
OMP-GFP mice. See Methods for details.

PCR analyses. Human olfactory mucosa samples were examined individually
whereas mouse tissue was pooled from four different animals. PCR products were
amplified with gene-specific primers and specificity was controlled by sequencing.
Primers are specified in Methods.

Immunohistochemistry and electron microscopy. These followed previously
published procedures as described in Methods.

Electrophysiology. We used intact mouse MOE preparations as described
previously. Details of the electrophysiological experiments are given in Methods.

Behavioural analyses. Innate olfactory preference tests and avoidance measurements
used methods similar to those previously described. Olfactory
habituation–dishabituation followed previously described procedures. Details are
given in Methods.

IGF-1 assays. These were done as described in Methods.

Statistics. Details of the statistical tests are given in Methods. Unless otherwise
stated, results are presented as means ± s.e.m.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS


Human biopsies. Human nasal mucosa was obtained by biopsy during routine 
nasal surgery with patients under general anaesthesia. Biopsy specimens were 
obtained from three individuals and snap-frozen in liquid nitrogen for later 
processing. All samples were obtained under a protocol approved by the Ethics 
Committee of the University of Saarland School of Medicine. All biopsy tissues 
were obtained with the informed consent of the patients. 

Human psychophysics. The UPSIT was obtained from Sensonics. The test was 
applied over a period of 25 min. Testing and scoring was done according to 
standardized operating procedures summarized in the test manual. The reference 
values have been derived from recorded reference ranges for the UPSIT test based 
on British individuals. 

Olfactory mucosa biopsies and PCR analyses. Human surgical material containing
olfactory mucosa collected from three different patients was examined
individually. RT-PCRs from human samples were performed on a MyCycler
(BIO-RAD) with Herculase (Agilent Technologies) following suppliers’ instructions.
To amplify human Gαolf we used the oligonucleotides
TGGAAAGAATCGACAGCGTCAGC and GGCCACCAACATCAAACATGTGG. Human
Nav1.7 was amplified by CATGAATAACCCACCGGACTG and
CCTATGCCCTTCGACACCAAGG. PCR conditions were: 95 °C for 2 min pre-denaturation,
followed by 35 cycles (95 °C for 30 s, 60 °C for 30 s (Gαolf) or 1 min (Nav1.7), 72 °C
for 30 s), followed by a final extension at 72 °C for 5 min. Mouse tissue was pooled
from four different B6 mice (4–8 weeks old). RNA was isolated with the
InnuPREP RNA isolation kit (Analytik Jena). RNA quality was assessed by gel
electrophoresis and photometric measurements. cDNA was synthesized from
0.5 µg of total RNA using the Smart cDNA Synthesis technology (Clontech)
and SuperScript II reverse transcriptase (Invitrogen). qPCR for different mouse
Nav subunits was done on a My-iQ-cycler using iQ SYBR Green Supermix
following the supplier’s instructions (BIO-RAD). We used the following
oligonucleotides: Nav1.1 (AGCCTGGTAGAACTTGGCCTTGC and
TGCCAACCACGGCAAAAATAAAG); Nav1.2 (TGGGATCTTCACCGCAGAAATG and
TGGGCCAGGATTTTGCCAAC); Nav1.3 (AGCTTGGCCTGGCAAACGTG
and ATGCCGACCACGGCAAAAATG); Nav1.5 (ACAGCCGAGTTTGAGGAGATGC
and CGCTGATTCGGTGCCTCA); Nav1.6 (ACGCCACAATTCGAACATGTCC
and CCTGGCTGATCTTACAGACGCA); Nav1.7
(ACGGATGAATTCAAAAATGTACTTGCAG and GTTCTCGTTGATCTTGCAAACACA).
PCR conditions were: 95 °C for 3 min pre-denaturation, followed by 42
cycles of 95 °C for 30 s, 64 °C for 20 s, 72 °C for 30 s. Each reaction was performed
in three replicates on 96-well plates and analysed with the iQ5 Software
(BIO-RAD). Specificity of all PCR products was confirmed by gel electrophoresis and
sequencing.
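
The cycling conditions above can be written down as small data records, which makes it easy to check the programmed hold time of each run. This is a sketch for illustration: the class and constant names are hypothetical, the step durations are taken from the text as stated, and ramp times between temperatures (which depend on the instrument) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    initial_denaturation_s: int
    cycles: int
    step_durations_s: tuple  # denaturation, annealing, extension (seconds)
    final_extension_s: int = 0

    def total_hold_seconds(self) -> int:
        """Sum of all programmed hold times, excluding temperature ramps."""
        per_cycle = sum(self.step_durations_s)
        return self.initial_denaturation_s + self.cycles * per_cycle + self.final_extension_s


# Human RT-PCR programs: 60 degC step of 30 s (G-alpha-olf) or 1 min (Nav1.7).
HUMAN_GOLF = PcrProgram(120, 35, (30, 30, 30), 300)
HUMAN_NAV17 = PcrProgram(120, 35, (30, 60, 30), 300)
# Mouse qPCR program: 42 cycles, no final extension stated in the text.
MOUSE_QPCR = PcrProgram(180, 42, (30, 20, 30))

for name, program in [("Golf", HUMAN_GOLF), ("Nav1.7", HUMAN_NAV17), ("qPCR", MOUSE_QPCR)]:
    print(f"{name}: {program.total_hold_seconds() / 60:.1f} min")
```

Under these assumptions the three programs hold for 59.5, 77.0 and 59.0 min respectively; actual run times would be longer once ramping is included.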

Mice. Animal care and experimental procedures were performed in accordance
with the guidelines established by the animal welfare committee of the University
of Saarland School of Medicine. Mice were kept under a standard light/dark cycle
with food and water ad libitum. Tissue-specific, Nav1.7-deficient mice were generated
by crossbreeding ‘floxed’ Nav1.7 mice that carry two loxP sites, flanking
exons 14 and 15 of Scn9a, with homozygous OMP-Cre mice
(B6;129P2-Omptm4(cre)Mom/MomJ) that express Cre recombinase under the control of the
OMP promoter. Further breeding established offspring that were both homozygous
for the floxed Nav1.7 alleles and heterozygous for cre and Omp. In these
mice, Cre-mediated Nav1.7 deletion was restricted to OMP-positive cells.
Additionally, C57BL/6J (B6) and OMP-GFP (B6;129P2-Omptm3Mom/MomJ)
mice were used.

Immunohistochemistry. Perfusion of mice and preparation of mouse olfactory 
tissues for immunohistochemistry followed previously described methods”. 
Cryosections (10-12 1m) of either human or mouse olfactory tissues were post- 
fixed using 4% paraformaldehyde in PBS, before blocking and antibody admin- 
istration. Primary antibodies were: mouse-specific anti-Na,1.7 (1:500, rabbit 
polyclonal; Millipore), human-specific anti-Nayl.7 (1:500, rabbit polyclonal; 
Abcam), Nay1.3 (1:500, rabbit polyclonal; Millipore), OMP (1:3,000, goat poly- 
clonal; gift of F. Margolis), vGluT2 (1:2,000, rabbit polyclonal; Synaptic Systems), 
tyrosine hydroxylase (TH, 1:3,000, mouse monoclonal; ImmunoStar). Secondary 
antibodies and conjugated compounds were: Alexa-Fluor 488 donkey-anti-goat 
(1:1,000; Invitrogen), Alexa-Fluor 555 donkey-anti-rabbit (1:1,000; Invitrogen), 
Alexa-Fluor 546 Streptavidin (1:200; Invitrogen). Procedures were conducted at 
room temperature (21°C), except for incubation in primary antibodies (4 °C). 
Expression of Nayl.7 in human was detected by direct immunofluorescence. 
Expression of Nay1.3 and Nay1.7 in mouse was detected by tyramid signal amp- 
lification using manufacturer’s protocol (TSA-Biotin System, Perkin Elmer). 
Incubation in primary antibody was for 2-3 days, in biotinylated anti-rabbit 
antibody (1:400; Jackson ImmunoResearch) for 1h, in streptavidin-HRP 
(1:100) for 30 min, in biotinylated tryamid (1:100) for 10 min, and visualized 
using Alexa 546-conjugated steptavidin (Invitrogen, 1:200). OMP colocalization 


was detected using a Alexa 488-conjugated anti-goat secondary antibody. 
Detection of vGluT2 was exactly as previously described. TH was detected in 
30-µm free-floating sections using the avidin-biotin method (Vectastain ABC-Elite, 
Vector). Incubation in primary TH antibody was for 1 day, in biotinylated 
horse-anti-mouse secondary antibody (1:400, Vector Laboratories) for 1 h, and in 
avidin/biotin-HRP complex (Vector) for 90 min. Immunoreactivity was visualized 
with 0.05 g l⁻¹ 3,3′-diaminobenzidine and 0.015% H₂O₂. Fluorescence 
images were acquired on either a BX71 microscope attached to a DP71 camera 
(Olympus) or an LSM 710/ConfoCor-3 microscope (Zeiss). Image stacks are 
presented as maximum intensity projections, assembled and minimally adjusted 
in brightness using Adobe Photoshop 6.0. 
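
As an illustration of the maximum intensity projection mentioned above, the following minimal Python sketch collapses a z-stack pixel by pixel. It is not the routine of the acquisition or image-editing software; the function name `max_intensity_projection`, the nested-list stack layout (sections, then rows, then columns) and the example values are assumptions made for this example only.

```python
def max_intensity_projection(stack):
    """Collapse a z-stack (list of 2D images) into a single 2D image.

    Each output pixel holds the brightest value found at that (row, col)
    position across all optical sections.
    """
    if not stack:
        raise ValueError("stack must contain at least one image")
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    return [
        [max(image[r][c] for image in stack) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical stack of three 2 x 2 pixel optical sections.
example_stack = [
    [[0, 5], [2, 1]],
    [[3, 1], [2, 9]],
    [[1, 4], [7, 0]],
]
projection = max_intensity_projection(example_stack)
```

Because only the per-pixel maximum is kept, structures lying in different focal planes appear together in one image, at the cost of depth information.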

Electron microscopy. Following routine processing for electron microscopy, as 
previously described, thin 70-100-nm sections were cut on a Reichert Ultracut 
E and examined on a JEOL 1200 transmission electron microscope. Images were 
captured at ×12,000, digitized at 1,200 dots per inch (DPI), and examined for 
ultrastructural features of the olfactory sensory axons and their synaptic terminals. 

Electrophysiology. Whole-cell patch-clamp recordings from individual OSNs 
were obtained in acute MOE tissue slices of P1-P5 mice. The anterior aspect 
of the head containing olfactory epithelium and bulb was embedded in agarose 
(4%) and placed in oxygenated, ice-cold extracellular solution (95% O₂, 5% CO₂) 
containing: 120 mM NaCl, 25 mM NaHCO₃, 5 mM KCl, 5 mM BES 
(N,N-bis[2-hydroxyethyl]-2-aminoethanesulphonic acid), 1 mM MgSO₄, 1 mM CaCl₂, 
10 mM glucose, osmolarity adjusted to 300 mOsm, pH 7.3. Coronal slices 
(250 µm) were cut on a vibratome (Microm HM 650 V), transferred to a recording 
chamber and kept under continuous flow (2 ml min⁻¹) of oxygenated solution 
or remained on ice in oxygenated solution until needed (for up to 4 h). 
Experiments were performed at room temperature. The CsCl-based electrode 
solution contained: 140 mM CsCl, 1 mM EGTA, 10 mM HEPES, 0.5 mM GTP 
Na-salt, 2 mM ATP Mg-salt, pH 7.1, 290 mOsm. To assess OSN firing properties 
under non-invasive conditions, we used extracellular loose-patch recording from 
OSN knobs. In this case, the septal epithelium of juvenile (P1-P5) or adult mice 
was dissected and transferred to a recording chamber. Patch pipettes (9-12 MΩ) 
were filled with a HEPES-based extracellular solution containing: 140 mM NaCl, 
5 mM KCl, 1 mM MgCl₂, 1 mM CaCl₂, 10 mM HEPES, pH 7.4, 300 mOsm. IBMX 
was prepared as a 10 mM stock solution containing 5% dimethylsulphoxide 
(DMSO) (v/v). For M/T cell recordings, brains were rapidly dissected in ice-cold 
oxygenated (95% O₂, 5% CO₂) solution containing: 83 mM NaCl, 26.2 mM 
NaHCO₃, 1 mM NaH₂PO₄, 2.5 mM KCl, 3.3 mM MgSO₄, 0.5 mM CaCl₂, 
70 mM sucrose, pH 7.3, 300 mOsm. Horizontal olfactory bulb slices (300 µm) 
were cut in this solution. Until use, slices were kept in oxygenated modified 
artificial cerebrospinal fluid (ACSF, 95% O₂, 5% CO₂) containing: 125 mM NaCl, 
25 mM NaHCO₃, 2.5 mM KCl, 1.25 mM NaH₂PO₄, 1 mM MgCl₂, 2 mM CaCl₂, 
and 25 mM glucose. Recording pipettes had resistances of 4-7 MΩ. M/T cells 
were identified by size and location of their somata and filled with Lucifer Yellow 
during patch recording. The intracellular solution contained: 140 mM KCl, 1 mM 
EGTA, 10 mM HEPES, 1 mM ATP Na-salt, 0.5 mM GTP Mg-salt, 0.1 mM Lucifer 
Yellow; pH 7.1, 290 mOsm. M/T cells were held at −55 to −60 mV. Input and 
series resistances were 200-300 MΩ and 15-20 MΩ, respectively. After establishing 
a whole-cell recording, the ONL was stimulated using a glass electrode 
(1-1.5 MΩ) filled with HEPES-buffered extracellular solution connected to an 
electrical stimulator (single stimulus: 20 ms, 40 V, 266-400 µA). The stimulus 
pipette was placed rostrally to the recorded cell in the ONL. If a given M/T cell 
showed no postsynaptic response, the position of the stimulus pipette was changed 
until OSN axon bundles were found that caused M/T cell responses. Ionic currents 
were analysed using PulseFit 8.54 (HEKA) and IGOR Pro software (Wavemetrics). 
OSNs with leak currents >20 pA and M/T cells with leak currents >100 pA (all 
measured at −70 mV) were excluded from analysis. Cell capacitance (Cₘ) was 
monitored using the automated function of the EPC-9 amplifier. A stable Cₘ value 
over time was an important criterion for the quality of an experiment. Spike analysis 
was done off-line using IGOR Pro software with custom-written macros. Chemicals 
were purchased from Sigma unless otherwise stated. Drugs used in the electro- 
physiological experiments were prepared as stock solutions in DMSO or distilled 
water and diluted to the final concentration in HEPES-based extracellular solution. 
NaCl, MgCl₂, glucose and CaCl₂ were from Merck. IBMX (100 µM) and cineole 
(100 µM) were diluted in a HEPES-buffered extracellular solution (<0.1% DMSO) 
and focally ejected using multibarrel stimulation pipettes. 
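
The stated DMSO limit can be checked with simple dilution arithmetic. The sketch below is illustrative only: the helper name `dilute` is hypothetical, and it assumes that the working IBMX solution was made by a single dilution of the 10 mM stock (5% DMSO, v/v) described above.

```python
def dilute(stock_conc_um, final_conc_um, stock_solvent_percent):
    """Return (dilution_factor, final_solvent_percent) for a one-step dilution.

    Concentrations are in micromolar; the solvent content is in percent (v/v).
    """
    if final_conc_um <= 0 or final_conc_um > stock_conc_um:
        raise ValueError("final concentration must be positive and not exceed the stock")
    factor = stock_conc_um / final_conc_um
    # The solvent is diluted by the same factor as the drug.
    return factor, stock_solvent_percent / factor


# IBMX: 10 mM (10,000 uM) stock in 5% DMSO, diluted to 100 uM.
factor, dmso_percent = dilute(10_000, 100, 5.0)
```

Under these assumptions the dilution is 100-fold and the final DMSO content is 0.05% (v/v), which is consistent with the stated value of <0.1%.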

Behavioural tests. The innate olfactory preference test followed previously 
described procedures. Briefly, mice were habituated to the test conditions before 
odour exposure. Mice were individually placed in an empty cage for 30 min and 
then transferred to a new cage. This habituation was repeated three to four times 
for each animal. Soon after habituation, mice were transferred to the test cage, and 
a filter paper scented with a test odorant was introduced. The time spent investigating 
the filter paper during the 3-min test period was recorded and quantified. Odour 
stimuli were freshly collected male and female B6 mouse urine (5 µl), peanut 
butter (10% w/v, 15 µl), milk powder (10% w/v, 15 µl), water (15 µl) and cineole 
(100 µM, 15 µl). 

For the innate olfactory avoidance test, following habituation (see innate preference 
test), a filter paper scented with 5 µl TMT (7.6 mM) was placed in one 
corner of the test cage. Mouse behaviour was recorded for 30 min. The test cage 
was subdivided into three equally sized areas. Time spent in area 1 of the cage 
(farthest distance from the TMT source) was evaluated as avoidance, whereas 
time spent in area 2 (containing the TMT source) was evaluated as attraction. 
Animal movements were tracked with SwisTrack (Swarm Intelligent Systems 
Group, Swiss Federal Institute of Technology). 

For the olfactory habituation-dishabituation assay, following habituation (see 
innate preference test) mice were exposed for 3 min to distilled water (15 µl). This 
procedure was repeated three times with 1-min intervals, followed by three 
presentations of female urine (5 µl) and three presentations of male urine 
(5 µl). Investigation times during the 3-min test periods were measured. 

For the pup retrieval test, lactating mice were habituated to the experiment for 
several minutes. Experiments were performed in the bedded home cages of the 
dams. Three pups (1-3 days old) were removed from the nest and randomly 
distributed in the cage. The latency for pup retrieval back into the nest was 
measured. If a dam had not completed retrieval within 10 min the test was 
terminated, resulting in a latency of 600 s. 
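
The scoring rule for unfinished trials amounts to capping the latency at the 10-min cut-off. A minimal sketch, assuming a hypothetical helper `retrieval_latency` in which `None` marks a dam that never completed retrieval:

```python
MAX_TEST_DURATION_S = 600  # 10-min cut-off stated in the text


def retrieval_latency(completion_time_s):
    """Return the scored latency in seconds.

    completion_time_s: time at which the last pup was back in the nest,
    or None if retrieval was not completed (hypothetical convention).
    """
    if completion_time_s is None or completion_time_s > MAX_TEST_DURATION_S:
        return MAX_TEST_DURATION_S
    return completion_time_s


# Hypothetical trials: two completed, one terminated at the cut-off.
latencies = [retrieval_latency(t) for t in (95.0, 310.5, None)]
```

Capped values are lower bounds on the true latency, which is worth keeping in mind when group means are compared.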

Experiments were performed in empty standard cages (38 × 19 × 12 cm) and 
test substances were applied on filter paper (~1 × 3 cm). Mouse behaviour was 
recorded with a digital camera (Sony) for the experimental times indicated. 
Statistical video analyses were done randomly and blindly. Peanut butter 
(Barney’s Best) and milk powder (Bio-Anfangsmilch, Hipp) were diluted to 
10% (w/v) in water. 

IGF-1 assays. IGF-1 levels were measured by sandwich ELISA (ALPCO 
Diagnostics). IGF-1 was dissociated from the binding proteins by diluting samples 
with an acidic buffer. The analytical sensitivity of the assay was 0.029 ng 
ml⁻¹. Inter- and intra-assay variability was below 7%. Experiments used plasma of 
4-5-week-old mice (n = 4, each genotype). 

Statistics. Data were analysed using NCSS 2004 statistical software (NCSS). The 
Student’s t-test (two-tailed) was used for measuring the significance of difference 
between two distributions. Multiple groups were compared using a one-way or 
two-way analysis of variance (ANOVA) with Fisher’s LSD as a post hoc comparison. 
Unless otherwise stated, results are presented as means ± s.e.m. 
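
For readers who want to see the summary statistics spelled out, the following sketch computes a mean ± s.e.m. and the pooled-variance Student's t statistic for two groups. It is not the NCSS analysis used here; the function names and the example values are hypothetical, and the two-tailed P value (which requires the t distribution) is deliberately left to a statistics package.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample s.d. (n - 1)."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical measurements from two groups of four animals each.
group_a = [1.0, 2.0, 3.0, 4.0]
group_b = [3.0, 4.0, 5.0, 6.0]
mean_a, sem_a = mean_sem(group_a)
t_value, dof = student_t(group_a, group_b)
```

The t statistic is then compared against the t distribution with `dof` degrees of freedom to obtain the two-tailed significance level.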